(54) Title: MOLECULES FOR DISEASE DETECTION AND TREATMENT

(57) Abstract: The present invention provides purified disease detection and treatment molecule polynucleotides (mddt). Also
encompassed are the polypeptides (MDDT) encoded by mddt. The invention also provides for the use of mddt, or complements,
oligonucleotides, or fragments thereof in diagnostic assays. The invention further provides for vectors and host cells containing mddt
for the expression of MDDT. The invention additionally provides for the use of isolated and purified MDDT to induce antibodies
and to screen libraries of compounds and the use of anti-MDDT antibodies in diagnostic assays. Also provided are microarrays
containing mddt and methods of use.

16. A microarray wherein at least one element of the microarray is a polynucleotide of claim



17. A method for generating a transcript image of a sample which contains polynucleotides,
the method comprising the steps of:

a) labeling the polynucleotides of the sample,

b) contacting the elements of the microarray of claim 16 with the labeled polynucleotides of
the sample under conditions suitable for the formation of a hybridization complex, and

c) quantifying the expression of the polynucleotides in the sample.



18. A method for screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of claim 1,
the method comprising:

a) exposing a sample comprising the target polynucleotide to a compound, and
b) detecting altered expression of the target polynucleotide.

19. A method of claim 6 for toxicity testing of a compound, further comprising

(c) comparing the presence, absence or amount of said target polynucleotide in a first
biological sample and a second biological sample, wherein said first biological sample has been
contacted with said compound, and said second sample is a control, whereby a change in presence,
absence or amount of said target polynucleotide in said first sample, as compared with said second
sample, is indicative of toxic response to said compound.

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 

1. Claims: 1-19 (all partially) 

Invention 1 

Isolated polynucleotide comprising a polynucleotide sequence 
with Seq.ID 1, methods, compositions and microarrays using
said polynucleotide, as well as a recombinant polynucleotide 
comprising a promoter sequence operably linked to Seq.ID 1 
and a cell and a transgenic organism comprising such a 
recombinant polynucleotide, a purified polypeptide encoded 
by Seq.ID 1, a method of producing such a polypeptide, an 
antibody which specifically binds to this polypeptide as 
well as methods of identifying a test compound using this
polypeptide. 



2. Claims: 1-19 (all partially) 
Inventions 2-14 

Isolated polynucleotide comprising a polynucleotide sequence 
with Seq.ID 2, methods, compositions and microarrays using 
said polynucleotide, as well as a recombinant polynucleotide 
comprising a promoter sequence operably linked to Seq.ID 2 
and a cell and a transgenic organism comprising such a 
recombinant polynucleotide, a purified polypeptide encoded 
by Seq.ID 2, a method of producing such a polypeptide, an 
antibody which specifically binds to this polypeptide as 
well as methods of identifying a test compound using this 
polypeptide. 

..idem for Seq.IDs 3-14

MOLECULES FOR DISEASE DETECTION AND TREATMENT

TECHNICAL FIELD 

The present invention relates to molecules for disease detection and treatment and to the use 
of these sequences in the diagnosis, study, prevention, and treatment of diseases associated with
disease detection and treatment molecules. 



BACKGROUND OF THE INVENTION 

The human genome comprises thousands of genes, many encoding gene products that
function in the maintenance and growth of the various cells and tissues in the body. Aberrant
expression of, or mutations in, these genes and their products are the cause of, or are associated with, a
variety of human diseases such as cancer and other cell proliferative disorders. The identification of 
these genes and their products is the basis of an ever-expanding effort to find markers for early 
detection of diseases, and targets for their prevention and treatment. 

For example, cancer represents a type of cell proliferative disorder that affects nearly every
tissue in the body. A wide variety of molecules, either aberrantly expressed or mutated, can be the
cause of, or involved with, various cancers because tissue growth involves complex and ordered
patterns of cell proliferation, cell differentiation, and apoptosis. Cell proliferation must be regulated
to maintain both the number of cells and their spatial organization. This regulation depends upon the
appropriate expression of proteins which control cell cycle progression in response to extracellular
signals such as growth factors and other mitogens, and intracellular cues such as DNA damage or 
nutrient starvation. Molecules which directly or indirectly modulate cell cycle progression fall into 
several categories, including growth factors and their receptors, second messenger and signal 
transduction proteins, oncogene products, tumor-suppressor proteins, and mitosis-promoting factors. 

Aberrant expression or mutations in any of these gene products can result in cell proliferative
disorders such as cancer. Oncogenes are genes generally derived from normal genes that, through
abnormal expression or mutation, can effect the transformation of a normal cell to a malignant one
(oncogenesis). Oncoproteins, encoded by oncogenes, can affect cell proliferation in a variety of ways
and include growth factors, growth factor receptors, intracellular signal transducers, nuclear
transcription factors, and cell-cycle control proteins. In contrast, tumor-suppressor genes are
involved in inhibiting cell proliferation. Mutations which cause reduced or loss of function in
tumor-suppressor genes result in aberrant cell proliferation and cancer. Thus a wide variety of genes
and their products have been found that are associated with cell proliferative disorders such as cancer,
but many more may exist that are yet to be discovered.

DNA-based arrays can provide a simple way to explore the expression of a single
polymorphic gene or a large number of genes. When the expression of a single gene is explored,
DNA-based arrays are employed to detect the expression of specific gene variants. For example, a
p53 tumor suppressor gene array is used to determine whether individuals are carrying mutations that
predispose them to cancer. A cytochrome p450 gene array is useful to determine whether individuals
have one of a number of specific mutations that could result in increased drug metabolism, drug
resistance or drug toxicity.

DNA-based array technology is especially relevant for the rapid screening of expression of a 
large number of genes. There is a growing awareness that gene expression is affected in a global 
fashion. A genetic predisposition, disease or therapeutic treatment may affect, directly or indirectly, 
the expression of a large number of genes. In some cases the interactions may be expected, such as
when the genes are part of the same signaling pathway. In other cases, such as when the genes
participate in separate signaling pathways, the interactions may be totally unexpected. Therefore, 
DNA-based arrays can be used to investigate how genetic predisposition, disease, or therapeutic 
treatment affects the expression of a large number of genes. 

The discovery of new molecules for disease detection and treatment satisfies a need in the art
by providing new compositions which are useful in the diagnosis, study, prevention, and treatment of
diseases. 

SUMMARY OF THE INVENTION 

The present invention relates to human polynucleotides encoding molecules for disease
detection and treatment (mddt) as presented in the Sequence Listing. Some of the mddt uniquely
identify genes encoding structural, functional, and regulatory molecules for disease detection and
treatment. 

The invention provides an isolated polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting
of SEQ ID NO: 1-14; b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; c) a
polynucleotide sequence complementary to a); d) a polynucleotide sequence complementary to b);
and e) an RNA equivalent of a) through d). In one alternative, the polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14. In another
alternative, the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide
sequence selected from the group consisting of a) a polynucleotide sequence selected from the group
consisting of SEQ ID NO: 1-14; b) a naturally occurring polynucleotide sequence having at least 90%
sequence identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14;
c) a polynucleotide sequence complementary to a); d) a polynucleotide sequence complementary
to b); and e) an RNA equivalent of a) through d). The invention further provides a composition for
the detection of expression of disease detection and treatment molecule polynucleotides comprising at
least one isolated polynucleotide comprising a polynucleotide sequence selected from the group
consisting of a) a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; b)
a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; c) a polynucleotide
sequence complementary to a); d) a polynucleotide sequence complementary to b); and e) an RNA
equivalent of a) through d); and a detectable label.

The invention also provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide comprising a polynucleotide sequence selected from the group consisting of a) a
polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; b) a naturally
occurring polynucleotide sequence having at least 90% sequence identity to a polynucleotide
sequence selected from the group consisting of SEQ ID NO: 1-14; c) a polynucleotide sequence
complementary to a); d) a polynucleotide sequence complementary to b); and e) an RNA equivalent
of a) through d). The method comprises a) hybridizing the sample with a probe comprising at least 20
contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the
sample, and which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target polynucleotide, and b)
detecting the presence or absence of said hybridization complex, and, optionally, if present, the
amount thereof. In one alternative, the probe comprises at least 30 contiguous nucleotides. In
another alternative, the probe comprises at least 60 contiguous nucleotides.

The invention further provides a recombinant polynucleotide comprising a promoter sequence
operably linked to an isolated polynucleotide comprising a polynucleotide sequence selected from the
group consisting of a) a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14;
b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; c) a polynucleotide
sequence complementary to a); d) a polynucleotide sequence complementary to b); and e) an RNA
equivalent of a) through d). In one alternative, the invention provides a cell transformed with the
recombinant polynucleotide. In another alternative, the invention provides a transgenic organism
comprising the recombinant polynucleotide. In a further alternative, the invention provides a method
for producing a disease detection and treatment molecule polypeptide, the method comprising a)
culturing a cell under conditions suitable for expression of the disease detection and treatment
molecule polypeptide, wherein said cell is transformed with the recombinant polynucleotide, and b)
recovering the disease detection and treatment molecule polypeptide so expressed.

The invention also provides a purified disease detection and treatment molecule polypeptide
(MDDT) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO: 1-14. Additionally, the invention provides an isolated antibody
which specifically binds to the disease detection and treatment molecule polypeptide. The invention
further provides a method of identifying a test compound which specifically binds to the disease
detection and treatment molecule polypeptide, the method comprising the steps of a) providing a test
compound; b) combining the disease detection and treatment molecule polypeptide with the test
compound for a sufficient time and under suitable conditions for binding; and c) detecting binding of
the disease detection and treatment molecule polypeptide to the test compound, thereby identifying
the test compound which specifically binds the disease detection and treatment molecule polypeptide.

The invention further provides a microarray wherein at least one element of the microarray is
an isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of a) a polynucleotide
sequence selected from the group consisting of SEQ ID NO: 1-14; b) a naturally occurring
polynucleotide sequence having at least 90% sequence identity to a polynucleotide sequence selected
from the group consisting of SEQ ID NO: 1-14; c) a polynucleotide sequence complementary to a); d)
a polynucleotide sequence complementary to b); and e) an RNA equivalent of a) through d). The
invention also provides a method of using the microarray for generating a transcript image of a
sample which contains polynucleotides. The method comprises a) labeling the polynucleotides of the
sample, b) contacting the elements of the microarray with the labeled polynucleotides of the sample
under conditions suitable for the formation of a hybridization complex, and c) quantifying the
expression of the polynucleotides in the sample.

Additionally, the invention provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of a) a polynucleotide sequence selected
from the group consisting of SEQ ID NO: 1-14; b) a naturally occurring polynucleotide sequence
having at least 90% sequence identity to a polynucleotide sequence selected from the group
consisting of SEQ ID NO: 1-14; c) a polynucleotide sequence complementary to a); d) a
polynucleotide sequence complementary to b); and e) an RNA equivalent of a) through d). The
method comprises a) exposing a sample comprising the target polynucleotide to a compound, and b)
detecting altered expression of the target polynucleotide.

The invention further provides a method for detecting a target polynucleotide in a sample for
toxicity testing of a compound, said target polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting
of SEQ ID NO: 1-14; b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14; c) a
polynucleotide sequence complementary to a); d) a polynucleotide sequence complementary to b);
and e) an RNA equivalent of a) through d). The method comprises a) hybridizing the sample with a
probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said
target polynucleotide in the sample, and which probe specifically hybridizes to said target
polynucleotide, under conditions whereby a hybridization complex is formed between said probe and
said target polynucleotide, b) detecting the presence or absence of said hybridization complex, and,
optionally, if present, the amount thereof, and c) comparing the presence, absence or amount of said
target polynucleotide in a first biological sample and a second biological sample, wherein said first
biological sample has been contacted with said compound, and said second sample is a control,
whereby a change in presence, absence or amount of said target polynucleotide in said first sample, as
compared with said second sample, is indicative of toxic response to said compound.
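
By way of illustration only, the comparison in step c) can be sketched in Python. The function name, the numeric signal values, the detection limit, and the two-fold threshold are all assumptions made for this sketch; the specification does not prescribe any particular threshold or data representation.

```python
def indicates_change(treated_signal, control_signal, min_fold=2.0, detection_limit=0.0):
    """Return True if the target amount differs between treated and control samples.

    treated_signal, control_signal: hypothetical hybridization-complex amounts.
    min_fold: assumed fold-change threshold (not taken from the specification).
    detection_limit: signals at or below this value are treated as "absent".
    """
    treated_present = treated_signal > detection_limit
    control_present = control_signal > detection_limit
    # A switch between presence and absence counts as a change.
    if treated_present != control_present:
        return True
    # Absent in both samples: nothing to compare.
    if not treated_present:
        return False
    # Present in both: compare amounts in either direction.
    ratio = treated_signal / control_signal
    return ratio >= min_fold or ratio <= 1.0 / min_fold
```

A True result flags only a change in the target polynucleotide between the contacted sample and the control; whether that change reflects a toxic response remains an experimental judgment.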

DESCRIPTION OF THE TABLES 

Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template 
identification numbers (template IDs) corresponding to the polynucleotides of the present invention, 
along with their GenBank hits (GI Numbers), probability scores, and functional annotations
corresponding to the GenBank hits. 

Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template 
identification numbers (template IDs) corresponding to the polynucleotides of the present invention, 
along with polynucleotide segments of each template sequence as defined by the indicated "start" and 
"stop" nucleotide positions. The reading frames of the polynucleotide segments and the Pfam hits,
Pfam descriptions, and E-values corresponding to the polypeptide domains encoded by the 
polynucleotide segments are indicated. 

Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template 
identification numbers (template IDs) corresponding to the polynucleotides of the present invention, 
along with polynucleotide segments of each template sequence as defined by the indicated "start" and
"stop" nucleotide positions. The reading frames of the polynucleotide segments are shown, and the 
polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or 
transmembrane (TM) domains, as indicated. 

Table 4 shows the sequence identification numbers (SEQ ID NO:s) and template 
identification numbers (template IDs) corresponding to the polynucleotides of the present invention,
along with component sequence identification numbers (component IDs) corresponding to each
template. The component sequences, which were used to assemble the template sequences, are
defined by the indicated "start" and "stop" nucleotide positions along each template.

Table 5 summarizes the bioinformatics tools which are useful for analysis of the
polynucleotides of the present invention. The first column of Table 5 lists analytical tools, programs,
and algorithms, the second column provides brief descriptions thereof, the third column presents
appropriate references, all of which are incorporated by reference herein in their entirety, and the
fourth column presents, where applicable, the scores, probability values, and other parameters used to
evaluate the strength of a match between two sequences (the higher the score, the greater the
homology between two sequences).

DETAILED DESCRIPTION OF THE INVENTION 

Before the nucleic acid sequences and methods are presented, it is to be understood that this 
invention is not limited to the particular machines, methods, and materials described. Although 
particular embodiments are described, machines, methods, and materials similar or equivalent to these
embodiments may be used to practice the invention. The preferred machines, methods, and materials 
set forth are not intended to limit the scope of the invention which is limited only by the appended 
claims. 

The singular forms "a", "an", and "the" include plural reference unless the context clearly 
dictates otherwise. All technical and scientific terms have the meanings commonly understood by
one of ordinary skill in the art. All publications are incorporated by reference for the purpose of 
describing and disclosing the cell lines, vectors, and methodologies which are presented and which 
might be used in connection with the invention. Nothing in the specification is to be construed as an 
admission that the invention is not entitled to antedate such disclosure by virtue of prior invention. 

Definitions 

As used herein, the lower case "mddt" refers to a nucleic acid sequence, while the upper case 
"MDDT" refers to an amino acid sequence encoded by mddt. A "full-length" mddt refers to a nucleic 
acid sequence containing the entire coding region of a gene endogenously expressed in human tissue. 

"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and
surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's 
immunological response.

"Allele" refers to an alternative form of a nucleic acid sequence. Alleles result from a
"mutation," a change or an alternative reading of the genetic code. Any given gene may have none,
one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or 
substitutions of nucleotides. Each of these changes may occur alone, or in combination with the 
others, one or more times in a given nucleic acid sequence. The present invention encompasses 
allelic mddt. 

"Amino acid sequence" refers to a peptide, a polypeptide, or a protein of either natural or
synthetic origin. The amino acid sequence is not limited to the complete, endogenous amino acid
sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic 
acid sequence. 

"Amplification" refers to the production of additional copies of a sequence and is carried out 
using polymerase chain reaction (PCR) technologies well known in the art.

"Antibody" refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2,
and Fv fragments, which are capable of binding the epitopic determinant. Antibodies that bind
MDDT polypeptides can be prepared using intact polypeptides or using fragments containing small
peptides of interest as the immunizing antigen. The polypeptide or peptide used to immunize an
animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized
chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are
chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet
hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target
sequence. The antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such
as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2-deoxyuracil, or 7-deaza-2'-deoxyguanosine.

"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target
sequence. The antisense sequence can be DNA, RNA, or any nucleic acid mimic or analog.

"Antisense technology" refers to any technology which relies on the specific hybridization of
an antisense sequence to a target sequence. 

A "bin" is a portion of computer memory space used by a computer program for storage of 
data, and bounded in such a manner that data stored in a bin may be retrieved by the program.

"Biologically active" refers to an amino acid sequence having a structural, regulatory, or
biochemical function of a naturally occurring amino acid sequence. 

"Clone joining" is a process for combining gene bins based upon the bins' containing 
sequence information from the same clone. The sequences may assemble into a primary gene 
transcript as well as one or more splice variants.

"Complementary" describes the relationship between two single-stranded nucleic acid 
sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
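
The base-pairing relationship in this definition can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes unambiguous DNA bases only (A, C, G, T); RNA and ambiguity codes are not handled.

```python
# Watson-Crick pairing for DNA bases; an assumption-limited lookup table.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Return the base-by-base complement, listed in the same left-to-right order.

    If seq is written 5'->3', the returned string reads 3'->5'.
    """
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq):
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(seq)[::-1]
```

For the example in the definition, `complement("AGT")` gives `"TCA"`, which is the 3'-T-C-A-5' strand; `reverse_complement("AGT")` gives the same strand written 5'->3' as `"ACT"`.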

A "component sequence" is a nucleic acid sequence selected by a computer program such as 
PHRED and used to assemble a consensus or template sequence from one or more component 
sequences.

A "consensus sequence" or "template sequence" is a nucleic acid sequence which has been
assembled from overlapping sequences, using a computer program for fragment assembly such as the
GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a
relational database management system (RDMS).

"Conservative amino acid substitutions" are those substitutions that, when made, least
interfere with the properties of the original protein, i.e., the structure and especially the function of
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded
as conservative substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr


Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in 
the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge 
or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
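
As a rough illustration, the substitution table can be encoded as a lookup in Python. The dictionary below transcribes the table using standard three-letter residue codes; the names `CONSERVATIVE` and `is_conservative` are hypothetical. The table is directional (for example, Ala may be replaced by Ser, but Ser is not listed as replaceable by Ala), and the sketch preserves that.

```python
# Original residue -> residues listed in the table as conservative substitutions.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}

def is_conservative(original, replacement):
    """Return True if the table lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE.get(original, set())
```

Residues absent from the table, such as Pro, simply return False for every replacement in this sketch.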

"Deletion" refers to a change in either a nucleic or amino acid sequence in which at least one 
nucleotide or amino acid residue, respectively, is absent. 

"Derivative" refers to the chemical modification of a nucleic acid sequence, such as by
replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray.

"E-value" refers to the statistical probability that a match between two sequences occurred by
chance.

A "fragment" is a unique portion of mddt or MDDT which is identical in sequence to but 
shorter in length than the parent sequence. A fragment may comprise up to the entire length of the 
defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise
from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used as a probe, primer, 
antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 
60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length. 
Fragments may be preferentially selected from certain regions of a molecule. For example, a
polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first
250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined 
sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, 
including the Sequence Listing and the figures, may be encompassed by the present embodiments. 
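
The length and identity conditions of this definition can be sketched in Python as follows. The function name and the `min_length` parameter are assumptions for illustration; the sketch checks only that a candidate is a contiguous, strictly shorter portion of the parent sequence, and does not test whether that portion is unique within a genome.

```python
def is_fragment(candidate, parent, min_length=1):
    """Return True if `candidate` is a contiguous portion of `parent` that is
    identical in sequence but shorter in length (at most len(parent) - 1).

    min_length: hypothetical lower bound, e.g. 20 for a probe-sized fragment.
    """
    return min_length <= len(candidate) < len(parent) and candidate in parent
```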

A fragment of mddt comprises a region of unique polynucleotide sequence that specifically
identifies mddt, for example, as distinct from any other sequence in the same genome. A fragment of
mddt is useful, for example, in hybridization and amplification technologies and in analogous 
methods that distinguish mddt from related polynucleotide sequences. The precise length of a 
fragment of mddt and the region of mddt to which the fragment corresponds are routinely 
determinable by one of ordinary skill in the art based on the intended purpose for the fragment. 

A fragment of MDDT is encoded by a fragment of mddt. A fragment of MDDT comprises a
region of unique amino acid sequence that specifically identifies MDDT. For example, a fragment of
MDDT is useful as an immunogenic peptide for the development of antibodies that specifically
recognize MDDT. The precise length of a fragment of MDDT and the region of MDDT to which the
fragment corresponds are routinely determinable by one of ordinary skill in the art based on the
intended purpose for the fragment.

A "full length" nucleotide sequence is one containing at least a start site for translation to a 
protein sequence, followed by an open reading frame and a stop site, and encoding a "full length" 
polypeptide. 

"Hit" refers to a sequence whose annotation will be used to describe a given template. 
Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid
matches, the top hit is the exact match with highest percent identity. If the template has no exact 
matches but has significant protein hits, the top hit is the protein hit with the lowest E-value. If the 
template has no significant protein hits, but does have significant non-exact nucleotide hits, the top hit 
is the nucleotide hit with the lowest E-value. 
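
The three-tier selection rule above can be sketched in Python. The dictionary keys (`kind`, `exact`, `significant`, `percent_identity`, `e_value`) and the function name are hypothetical; how a hit is judged "exact" or "significant" is assumed to have been decided upstream and is not defined here.

```python
def select_top_hit(hits):
    """Pick the annotation hit for a template using the three-tier rule.

    Each hit is a dict with hypothetical keys:
      kind: "nucleotide" or "protein"
      exact: True for an exact nucleic acid match
      significant: True if the hit passed an upstream significance cutoff
      percent_identity, e_value: numeric scores
    Returns None when no tier applies.
    """
    # Tier 1: exact nucleic acid matches, highest percent identity wins.
    exact = [h for h in hits if h["kind"] == "nucleotide" and h["exact"]]
    if exact:
        return max(exact, key=lambda h: h["percent_identity"])
    # Tier 2: significant protein hits, lowest E-value wins.
    proteins = [h for h in hits if h["kind"] == "protein" and h["significant"]]
    if proteins:
        return min(proteins, key=lambda h: h["e_value"])
    # Tier 3: significant non-exact nucleotide hits, lowest E-value wins.
    nucleotides = [
        h for h in hits
        if h["kind"] == "nucleotide" and not h["exact"] and h["significant"]
    ]
    if nucleotides:
        return min(nucleotides, key=lambda h: h["e_value"])
    return None
```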
"Homology" refers to sequence similarity either between a reference nucleic acid sequence
and at least a fragment of an mddt or between a reference amino acid sequence and a fragment of an
MDDT.

"Hybridization" refers to the process by which a strand of nucleotides anneals with a 
complementary strand through base pairing. Specific hybridization is an indication that two nucleic
acid sequences share a high degree of identity. Specific hybridization complexes form under defined
annealing conditions, and remain hybridized after the "washing" step. The defined hybridization
conditions include the annealing conditions and the washing step(s), the latter of which is particularly
important in determining the stringency of the hybridization process, with more stringent conditions
allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not
perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely 
determinable and may be consistent among hybridization experiments, whereas wash conditions may 
be varied among experiments to achieve the desired stringency. 

Generally, stringency of hybridization is expressed with reference to the temperature under
which the wash step is carried out. Generally, such wash temperatures are selected to be about 5°C to
20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target
sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for
nucleic acid hybridization are well known and can be found in Sambrook et al., 1989, Molecular
Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY;
specifically see volume 2, chapter 9.
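
By way of illustration only, the sketch below estimates Tm with one widely quoted empirical approximation for long DNA:DNA hybrids and then derives the wash temperature window of about 5°C to 20°C below Tm. The constants (81.5, 16.6, 0.41, 0.63, 600) vary between references, so this is not necessarily the equation the cited manual gives; the function names and parameters are assumptions of this example.

```python
import math


def melting_temperature(seq, sodium_molar, formamide_percent=0.0):
    """Approximate Tm in degrees C for a long DNA:DNA hybrid.

    Assumed empirical form:
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.63*(%formamide) - 600/N
    """
    seq = seq.upper()
    gc_percent = 100.0 * sum(seq.count(base) for base in "GC") / len(seq)
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / len(seq))


def wash_temperature_range(tm):
    """Wash temperatures about 5 to 20 degrees C below Tm, as (low, high)."""
    return tm - 20.0, tm - 5.0
```

For a 200-nucleotide sequence with 50% G+C content at 1.0 M sodium and no formamide, the approximation gives a Tm of 99°C and therefore a wash window of 79°C to 94°C.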

High stringency conditions for hybridization between polynucleotides of the present
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, or 55°C may be used. SSC
concentration may be varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, denatured salmon sperm DNA at about 100-200 µg/ml. Useful variations on
these conditions will be readily apparent to those skilled in the art. Hybridization, particularly under
high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides.
Such similarity is strongly indicative of a similar role for the nucleotides and their resultant proteins.

Other parameters, such as temperature, salt concentration, and detergent concentration may 
be varied to achieve the desired stringency. Denaturants, such as formamide at a concentration of 
about 35-50% v/v, may also be used under particular circumstances, such as RNA:DNA 
hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary
skill in the art.

"Immunogenic" describes the potential for a natural, recombinant, or synthetic peptide,
epitope, polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell 
lines. 

"Insertion" or "addition" refers to a change in either a nucleic or amino acid sequence in 
which at least one nucleotide or residue, respectively, is added to the sequence.

"Labeling" refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or 
antibody with a reporter molecule capable of producing a detectable or measurable signal. 

"Microarray" is any arrangement of nucleic acids, amino acids, antibodies, etc., on a 
substrate. The substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or 
an appropriate membrane.

"Linkers" are short stretches of nucleotide sequence which may be added to a vector or an 
mddt to create restriction endonuclease sites to facilitate cloning. "Polylinkers" are engineered to
incorporate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or
3' overhangs (e.g., BamHI, EcoRI, and HindIII) and those which provide blunt ends (e.g., EcoRV,
SnaBI, and StuI).

"Naturally occurring" refers to an endogenous polynucleotide or polypeptide that may be 
isolated from viruses or prokaryotic or eukaryotic cells. 

"Nucleic acid sequence" refers to the specific order of nucleotides joined by phosphodiester 
bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid 
sequence can be considered an oligomer, oligonucleotide, or polynucleotide. The nucleic acid can be
DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be
either double-stranded or single-stranded, and can represent either the sense or antisense
(complementary) strand.

"Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides and as many as 
about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20
and 30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may 
be used as, e.g., primers for PCR, and are usually chemically synthesized. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with the second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Generally, operably linked DNA sequences may be in close proximity or contiguous and,
where necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to a DNA mimic in which nucleotide bases are attached 
to a pseudopeptide backbone to increase stability. PNAs, also designated antigene agents, can 
prevent gene expression by targeting complementary messenger RNA.

The phrases "percent identity" and "% identity", as applied to polynucleotide sequences,
refer to the percentage of residue matches between at least two polynucleotide sequences aligned
using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible
way, gaps in the sequences being compared in order to optimize alignment between two sequences,
and therefore achieve a more meaningful comparison of the two sequences.
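
As a minimal sketch of this definition, the function below counts residue matches in two sequences that have already been aligned, with gaps written as "-". Dividing by the number of alignment columns is an assumption of this example; alignment programs differ in the denominator they use, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_a)
    if columns == 0:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / columns
```

For the aligned pair "ACGT-A" and "ACCTGA", four of six columns match, giving about 66.7% identity under this convention.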

Percent identity between polynucleotide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program. This program is part of the LASERGENE software package, a suite of
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in
Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992)
CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are
set as follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted"
residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the
"percent similarity" between aligned polynucleotide sequence pairs.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment
Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available
from several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence
analysis programs including "blastn," that is used to determine alignment between a known
polynucleotide sequence and other sequences on a variety of databases. Also available is a tool called
"BLAST 2 Sequences" that is used for direct pairwise comparison of two nucleotide sequences.
"BLAST 2 Sequences" can be accessed and used interactively at
http://www.ncbi.nlm.nih.gov/gorf/bl2/. The "BLAST 2 Sequences" tool can be used for both blastn
and blastp (discussed below). BLAST programs are commonly used with gap and other parameters
set to default settings. For example, to compare two nucleotide sequences, one may use blastn with
the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such default
parameters may be, for example:
Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap X drop-off: 50
Expect: 10
Word Size: 11
Filter: on
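
To show how the listed scoring parameters combine, the sketch below computes a raw score for an already aligned pair of nucleotide sequences. It does not call BLAST. The dictionary keys and function name are hypothetical, and the gap cost convention (a gap of length k costs the open penalty plus k times the extension penalty) is an assumption of this example.

```python
BLASTN_EXAMPLE_DEFAULTS = {
    "matrix": "BLOSUM62",   # listed in the text; unused by this nucleotide sketch
    "reward": 1,
    "penalty": -2,
    "gap_open": 5,
    "gap_extend": 2,
    "x_dropoff": 50,
    "expect": 10,
    "word_size": 11,
    "filter": True,
}


def raw_alignment_score(aligned_a, aligned_b, params=BLASTN_EXAMPLE_DEFAULTS, gap="-"):
    """Sum match rewards, mismatch penalties, and affine gap costs over an alignment."""
    score = 0
    open_gap_in = None  # which sequence ("a" or "b") currently has an open gap
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            side = "a" if x == gap else "b"
            if side != open_gap_in:
                score -= params["gap_open"]   # charged once per gap
            score -= params["gap_extend"]     # charged for every gap position
            open_gap_in = side
        else:
            score += params["reward"] if x == y else params["penalty"]
            open_gap_in = None
    return score
```

Under these assumptions, aligning "ACGT--AC" with "ACCTGGAC" yields five matches (+5), one mismatch (-2), and one gap of length two (-9), for a raw score of -6.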

Percent identity may be measured over the length of an entire defined sequence, for example, 
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, 
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at 
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length 
supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a 
length over which percentage identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid 
sequences that all encode substantially the same protein. 
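
A small worked example may make the point concrete: the two hypothetical 15-nucleotide sequences below share only 8 of 15 positions, yet both translate to the same peptide. The codon table is deliberately partial, covering only the codons used here (standard genetic code), and the function names are assumptions of this sketch.

```python
# Partial codon table: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TCT": "S", "AGC": "S",
}


def translate(dna):
    """Translate a DNA string codon by codon using the partial table above."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def nucleotide_identity(a, b):
    """Percent of identical positions between two equal-length sequences."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


seq_1 = "ATGGCTAAACTGTCT"
seq_2 = "ATGGCGAAGTTAAGC"
```

Here `translate(seq_1)` and `translate(seq_2)` both give the peptide MAKLS, while `nucleotide_identity(seq_1, seq_2)` is about 53%.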

The phrases "percent identity" and "% identity", as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned using a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some
alignment methods take into account conservative amino acid substitutions. Such conservative
substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the
substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide.

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise 
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) with blastp set at default parameters. Such default parameters may be, for example:
Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalty
Gap X drop-off: 50 
Expect: 10 
Word Size: 3 
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in figures or Sequence Listings, may be used to
describe a length over which percentage identity may be measured.

"Post-translational modification" of an MDDT may involve lipidation, glycosylation, 
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in
the art. These processes may occur synthetically or biochemically. Biochemical modifications will
vary by cell type depending on the enzymatic milieu and the MDDT. 

"Probe" refers to mddt or fragments thereof, which are used to detect identical, allelic or 
related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a 
detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands,
chemiluminescent agents, and enzymes. "Primers" are short nucleic acids, usually DNA
oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing.
The primer may then be extended along the target DNA strand by a DNA polymerase enzyme.
Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the
polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or
at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may
be considerably longer than these examples, and it is understood that any length supported by the
specification, including the figures and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in the references, for
example Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel et al., 1987, Current Protocols in Molecular Biology,
Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis et al., 1990, PCR Protocols, A
Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs can be
derived from a known sequence, for example, by using computer programs intended for that purpose
such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

"Purified" refers to molecules, either polynucleotides or polypeptides, that are isolated or
separated from their natural environment and are at least 60% free, preferably at least 75% free, and
most preferably at least 90% free from other compounds with which they are naturally associated.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter
sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to
transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a 
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

"Regulatory element" refers to a nucleic acid sequence from nontranslated regions of a gene,
and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host 
proteins to carry out or regulate transcription or translation. 

"Reporter" molecules are chemical or biochemical moieties used for labeling a nucleic acid, 
an amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent,
or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known 
in the art. 

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear 
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the 
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose 
instead of deoxyribose. 
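
At the level of sequence text, the definition reduces to a single substitution, sketched below. The function name is hypothetical, and the change in the sugar backbone from deoxyribose to ribose has no representation in a plain sequence string.

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence string (T -> U, case preserved)."""
    return dna.replace("T", "U").replace("t", "u")
```

For example, `rna_equivalent("ATGCTT")` gives "AUGCUU".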

"Sample" is used in its broadest sense. Samples may contain nucleic or amino acids, 
antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but 
not limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a 
cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or 
blots or imprints from such cells or tissues).

"Specific binding" or "specifically binding" refers to the interaction between a protein or
peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent 
upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, 
recognized by the binding molecule. For example, if an antibody is specific for epitope "A," the 
presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction 
containing free labeled A and the antibody will reduce the amount of labeled A that binds to the 
antibody. 

"Substitution" refers to the replacement of at least one nucleotide or amino acid by a different 
nucleotide or amino acid.

"Substrate" refers to any suitable rigid or semi-rigid support including, e.g., membranes,
filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, 
microparticles or capillaries. The substrate can have a variety of surface forms, such as wells, 
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound. 

A "transcript image" refers to the collective pattern of gene expression by a particular tissue 
or cell type under given conditions at a given time. 

"Transformation" refers to a process by which exogenous DNA enters a recipient cell. 
Transformation may occur under natural or artificial conditions using various methods well known in 
the art. Transformation may rely on any known method for the insertion of foreign nucleic acid 
sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell
being transformed. 

"Transformants" include stably transformed cells in which the inserted DNA is capable of 
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well
as cells which transiently express inserted DNA or RNA.

A "transgenic organism," as used herein, is any organism, including but not limited to animals
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well known in the art.
The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of
the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a
recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, and plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook et al. (1989),
supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 30%, at
least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or even at least 98% or
greater sequence identity over a certain defined length. The variant may result in "conservative"
amino acid changes which do not affect structural and/or chemical properties. A variant may be
described as, for example, an "allelic" (as defined above), "splice," "species," or "polymorphic"
variant. A splice variant may have significant identity to a reference molecule, but will generally
have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA
processing. The corresponding polypeptide may possess additional functional domains or lack
domains that are present in the reference molecule. Species variants are polynucleotide sequences
that vary from one species to another. The resulting polypeptides generally will have significant
amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide
sequence of a particular gene between individuals of a given species. Polymorphic variants also may
encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies
by one base. The presence of SNPs may be indicative of, for example, a certain population, a disease
state, or a propensity for a disease state.
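
For illustration, the sketch below lists the positions at which two equal-length, pre-aligned sequences differ by a single base. This is a simplified comparison and not a description of how SNPs are identified in practice; the function name and the 1-based position convention are assumptions of this example.

```python
def single_base_differences(reference, variant):
    """Return (position, reference_base, variant_base) for each differing base.

    Positions are 1-based. Both sequences must already be the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
        if r != v
    ]
```

Comparing "ACGTAC" with "ACGAAC" reports one difference, a T to A change at position 4.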

In an alternative, variants of the polynucleotides of the present invention may be generated 
through recombinant methods. One possible method is a DNA shuffling technique such as 
MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of MDDT, such as its biological or enzymatic activity or its ability
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby 
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable 
manner. 

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-
1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at 
least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% or greater sequence 
identity over a certain defined length of one of the polypeptides. 


THE INVENTION 

In a particular embodiment, cDNA sequences derived from human tissues and cell lines were 
aligned based on nucleotide sequence identity and assembled into "consensus" or "template" 
sequences which are designated by the template identification numbers (template IDs) in column 2 of
Table 1. The sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are
shown in column 1. The template sequences have similarity to GenBank sequences, or "hits," as
designated by the GI Numbers in column 3. The statistical probability of each GenBank hit is 
indicated by a probability score in column 4, and the functional annotation corresponding to each 
GenBank hit is listed in column 5. 

The invention incorporates the nucleic acid sequences of these templates as disclosed in the
Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states
characterized by defects in molecules for disease detection and treatment. The invention further
utilizes these sequences in hybridization and amplification technologies, and in particular, in
technologies which assess gene expression patterns correlated with specific cells or tissues and their
responses in vivo or in vitro to pharmaceutical agents, toxins, and other treatments. In this manner,
the sequences of the present invention are used to develop a transcript image for a particular cell or 
tissue. 

Derivation of Nucleic Acid Sequences 

cDNA was isolated from libraries constructed using RNA derived from normal and diseased
human tissues and cell lines. The human tissues and cell lines used for cDNA library construction
were selected from a broad range of sources to provide a diverse population of cDNAs representative
of gene transcription throughout the human body. Descriptions of the human tissues and cell lines
used for cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc.
(Incyte), Palo Alto CA). Human tissues were broadly selected from, for example, cardiovascular,
dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, 
reproductive, and urologic sources. 

Cell lines used for cDNA library construction were derived from, for example, leukemic 
cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells.
Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell
lines commonly used and available from public depositories (American Type Culture Collection,
Manassas VA). Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical
agent such as 5-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in
the case of leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress. 


Sequencing of the cDNAs 

Methods for DNA sequencing are well known in the art. Conventional enzymatic methods 
employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S. 
Biochemical Corporation, Cleveland OH), Taq polymerase (PE Biosystems, Foster City CA),
thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham Pharmacia Biotech),
Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found
in the ELONGASE amplification system (Life Technologies Inc. (Life Technologies), Gaithersburg
MD), to extend the nucleic acid sequence from an oligonucleotide primer annealed to the DNA
template of interest. Methods have been developed for the use of both single-stranded and
double-stranded templates. Chain termination reaction products may be electrophoresed on
urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or
by fluorescence (for fluorophore-labeled nucleotides). Automated methods for mechanized reaction
preparation, sequencing, and analysis using fluorescence detection methods have been developed.
Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer
system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research,
Inc. (MJ Research), Watertown MA), and ABI CATALYST 800 thermal cycler (PE Biosystems). 
Sequencing can be carried out using, for example, the ABI 373 or 377 (PE Biosystems) or 
MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA 
sequencing systems, or other automated and manual sequencing systems well known in the art. 

The nucleotide sequences of the Sequence Listing have been prepared by current,
state-of-the-art, automated methods and, as such, may contain occasional sequencing errors or unidentified
nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified
bases do not represent a hindrance to practicing the invention for those skilled in the art. Several
methods employing standard recombinant techniques may be used to correct errors and complete the
missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short
Protocols in Molecular Biology, John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989)
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview NY.)

Assembly of cDNA Sequences 

Human polynucleotide sequences may be assembled using programs or algorithms well
known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from
a single or many different transcripts. Assembly of the sequences can be performed using such
programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly
system (GCG), or other methods known in the art.

Alternatively, cDNA sequences are used as "component" sequences that are assembled into
"template" or "consensus" sequences as follows. Sequence chromatograms are processed and verified,
and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway
known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA).
A series of BLAST comparisons is performed and low-information segments and repetitive elements
(e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious
matches. Mitochondrial and ribosomal RNA sequences are also removed. The processed sequences
are then loaded into a relational database management system (RDMS) which assigns edited
sequences to existing templates, if available. When additional sequences are added into the RDMS, a
process is initiated which modifies existing templates or creates new templates from works in
progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences
themselves. After the new sequences have been assigned to templates, the templates can be merged
into bins. If multiple templates exist in one bin, the bin can be split and the templates reannotated.
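
By way of illustration only, the masking of repetitive elements described above may be sketched in Python as follows. The function name, the regular expression, and the minimum repeat count of six units are assumptions made for this example; they are not taken from the Block 1 editing pathway, and Alu or other interspersed repeats would require a different approach (e.g., comparison against a repeat library).

```python
import re

def mask_dinucleotide_repeats(seq: str, min_units: int = 6) -> str:
    """Replace runs of a tandemly repeated dinucleotide with 'n' characters.

    A run is masked when the same two-base unit occurs at least `min_units`
    times in a row (an illustrative threshold). Homopolymer runs such as
    AAAAAAAAAAAA also satisfy this pattern and are masked as well.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1), re.IGNORECASE)
    # Substitute each matched run with the same number of 'n' characters so
    # that sequence coordinates are preserved.
    return pattern.sub(lambda m: "n" * len(m.group(0)), seq)


if __name__ == "__main__":
    print(mask_dinucleotide_repeats("ATGC" + "CA" * 6 + "GGTT"))
```

Masking in place, rather than deleting the repeat, keeps the positions of the remaining bases unchanged for any later alignment step.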

Once gene bins have been generated based upon sequence alignments, bins are "clone joined" 
based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in 
one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two
bins should be merged into a single bin. Only bins which share at least two different clones are 
merged. 
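
The clone joining rule above (bins are merged only when they share at least two different clones) may be sketched as follows. This is a minimal illustration under stated assumptions: each bin is represented as a set of clone identifiers, the names `clone_join` and `min_shared` are hypothetical, and shared clones are counted between the original bins only, not between bins that have already been merged.

```python
from collections import defaultdict
from itertools import combinations

def clone_join(bins: dict[str, set[str]], min_shared: int = 2) -> list[set[str]]:
    """Group bin names that should be merged because they share clones.

    `bins` maps a bin name to the set of clone IDs that contributed a 5' or
    3' read to that bin. Two bins are joined when they have at least
    `min_shared` clones in common.
    """
    parent = {name: name for name in bins}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(sorted(bins), 2):
        if len(bins[a] & bins[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in bins:
        groups[find(name)].add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


if __name__ == "__main__":
    example = {"bin1": {"c1", "c2", "c3"}, "bin2": {"c1", "c2"}, "bin3": {"c3"}}
    print(clone_join(example))
```

In the example, bin1 and bin2 share two clones and are grouped together, while bin3 shares only one clone with bin1 and remains separate.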

A resultant template sequence may contain either a partial or a full length open reading 
frame, or all or part of a genetic regulatory element. This variation is due in part to the fact that the
full length cDNAs of many genes are several hundred, and sometimes several thousand, bases in
length. With current technology, cDNAs comprising the coding regions of large genes cannot be
cloned because of vector limitations, incomplete reverse transcription of the mRNA, or incomplete
"second strand" synthesis. Template sequences may be extended to include additional contiguous
sequences derived from the parent RNA transcript using a variety of methods known to those of skill
in the art. Extension may thus be used to achieve the full length coding sequence of a gene.

Analysis of the cDNA Sequences 

The cDNA sequences are analyzed using a variety of programs and algorithms which are well 
known in the art. (See, e.g., Ausubel, 1997, supra, Chapter 7.7; Meyers, R.A. (Ed.) (1995) Molecular
Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853; and Table 5.) These analyses
comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular 
organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and 
stop codons; and homology searches. 
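
As a simplified illustration of the analysis of potential start and stop codons, the following Python sketch scans the three forward reading frames of a sequence for ATG-to-stop open reading frames. The function name and the `min_codons` parameter are assumptions for this example; the sketch does not implement the codon periodicity method of Fickett and does not examine the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each ATG-to-stop ORF on the forward strand.

    `start` is the index of the A of the ATG, `end` is the index just past
    the stop codon, and `min_codons` is the minimum number of codons
    (including the ATG, excluding the stop) required to report an ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

An ORF that runs off the end of the sequence without a stop codon is not reported by this sketch, although such partial reading frames are expected in incomplete templates.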

Computer programs known to those of skill in the art for performing computer-assisted
searches for amino acid and nucleic acid sequence similarity, include, for example, Basic Local
Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; Altschul, S.F. et al.
(1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in determining exact matches and
comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally
maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the
user (Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845). Using an appropriate search
tool (e.g., BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM and other databases may be
searched for sequences containing regions of homology to a query mddt or MDDT of the present 
invention. 
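
The notion of a locally maximal alignment between two fragments of equal length, scored against a cutoff, may be illustrated with the following Python sketch. It is not the BLAST algorithm: it exhaustively scans every ungapped diagonal instead of using word seeding and extension, and the function name and the match and mismatch scores of +5 and -4 are illustrative assumptions.

```python
def best_segment_pair(a: str, b: str, match: int = 5, mismatch: int = -4):
    """Return (score, a_start, b_start, length) of the best ungapped segment pair.

    Every diagonal of the comparison is scanned with a running-sum
    (Kadane-style) search, so the reported pair of equal-length fragments
    has a score that cannot be improved by extending or trimming it.
    """
    best = (0, 0, 0, 0)
    for offset in range(-(len(b) - 1), len(a)):
        i, j = max(offset, 0), max(-offset, 0)  # start of this diagonal
        run_score, run_start, k = 0, 0, 0
        while i + k < len(a) and j + k < len(b):
            if run_score <= 0:
                # A non-positive prefix can never help; restart the segment.
                run_score, run_start = 0, k
            run_score += match if a[i + k] == b[j + k] else mismatch
            if run_score > best[0]:
                best = (run_score, i + run_start, j + run_start, k - run_start + 1)
            k += 1
    return best


def meets_cutoff(a: str, b: str, cutoff: int) -> bool:
    """Report whether the best segment pair reaches a user-set cutoff score."""
    return best_segment_pair(a, b)[0] >= cutoff


if __name__ == "__main__":
    print(best_segment_pair("GGGACGTACCC", "TTACGTATT"))
```

In the example the shared fragment ACGTA (five matches) gives a score of 25, so the pair would pass a cutoff of 20 but not a cutoff of 30.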

Other approaches to the identification, assembly, storage, and display of nucleotide and
polypeptide sequences are provided in "Relational Database for Storing Biomolecule Information,"
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence
Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for
Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998,
all of which are incorporated by reference herein in their entirety.

Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif, 
BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example,
in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence
Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference. 

Human Disease Detection and Treatment Molecule Sequences 

The mddt of the present invention may be used for a variety of diagnostic and therapeutic 
purposes. For example, an mddt may be used to diagnose a particular condition, disease, or disorder 
associated with disease detection and treatment molecules. Such conditions, diseases, and disorders 
include, but are not limited to, a cell proliferative disorder, such as actinic keratosis, arteriosclerosis, 
atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and
cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, 
teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, 
breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, 
pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and 
uterus; and an autoimmune/inflammatory disorder, such as actinic keratosis, acquired
immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, 
allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis, 
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis, 
contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, 
emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis,
Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal
hemoglobinuria, hepatitis, hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with
lymphocytotoxins, mixed connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis, 
myocardial or pericardial inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis, 
polycythemia vera, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, 
Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, 
primary thrombocythemia, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, 
complications of cancer, hemodialysis, and extracorporeal circulation, trauma, and hematopoietic 
cancer including lymphoma, leukemia, and myeloma. The mddt can be used to detect the presence
of, or to quantify the amount of, an mddt-related polynucleotide in a sample. This information is then
compared to information obtained from appropriate reference samples, and a diagnosis is established. 
Alternatively, a polynucleotide complementary to a given mddt can inhibit or inactivate a 
therapeutically relevant gene related to the mddt. 

Analysis of mddt Expression Patterns

The expression of mddt may be routinely assessed by hybridization-based methods to 
determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity 
of mddt expression. For example, the level of expression of mddt may be compared among different
cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at
different developmental stages, or among cell types or tissues undergoing various treatments. This
type of analysis is useful, for example, to assess the relative levels of mddt expression in fully or
partially differentiated cells or tissues, to determine if changes in mddt expression levels are
correlated with the development or progression of specific disease states, and to assess the response
of a cell or tissue to a specific therapy, for example, in pharmacological or toxicological studies.
Methods for the analysis of mddt expression are based on hybridization and amplification 
technologies and include membrane-based procedures such as northern blot analysis, high-throughput 
procedures that utilize, for example, microarrays, and PCR-based procedures. 

Hybridization and Genetic Analysis

The mddt, their fragments, or complementary sequences, may be used to identify the presence 
of and/or to determine the degree of similarity between two (or more) nucleic acid sequences. The 
mddt may be hybridized to naturally occurring or recombinant nucleic acid sequences under 
appropriately selected temperatures and salt concentrations. Hybridization with a probe based on the
nucleic acid sequence of at least one of the mddt allows for the detection of nucleic acid sequences,
including genomic sequences, which are identical or related to the mddt of the Sequence Listing.
Probes may be selected from non-conserved or unique regions of at least one of the polynucleotides 
of SEQ ID NO: 1-14 and tested for their ability to identify or amplify the target nucleic acid sequence 
using standard protocols. 

Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in
SEQ ID NO: 1-14 and fragments thereof, can be identified using various conditions of stringency.
(See, e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987)
Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions."

A probe for use in Southern or northern hybridization may be derived from a fragment of an
mddt sequence, or its complement, that is up to several hundred nucleotides in length and is either
single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials
such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to
artificial substrates containing mddt. Microarrays are particularly suitable for identifying the
presence of and detecting the level of expression for multiple genes of interest by examining gene
expression correlated with, e.g., various stages of development, treatment with a drug or compound,
or disease progression. An array analogous to a dot or slot blot may be used to arrange and link 
polynucleotides to the surface of a substrate using one or more of the following: mechanical 
(vacuum), chemical, thermal, or UV bonding procedures. Such an array may contain any number of 
mddt and may be produced by hand or by using available devices, materials, and machines. 

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-
2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)

Probes may be labeled by either PCR or enzymatic techniques using a variety of
commercially available reporter molecules. For example, commercial kits are available for
radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline
phosphatase labeling (Life Technologies). Alternatively, mddt may be cloned into commercially
available vectors for the production of RNA probes. Such probes may be transcribed in the presence
of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).

Additionally the polynucleotides of SEQ ID NO: 1-14 or suitable fragments thereof can be 
used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures 
well known in the art, e.g., cDNA library screening, PCR amplification, etc. The molecular cloning
of such full length cDNA sequences may employ the method of cDNA library screening with probes
using the hybridization, stringency, washing, and probing strategies described above and in Ausubel,
supra, Chapters 3, 5, and 6. These procedures may also be employed with genomic libraries to isolate
genomic sequences of mddt in order to analyze, e.g., regulatory elements. 

Genetic Mapping 

Gene identification and mapping are important in the investigation and treatment of almost all
conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis,
diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex
than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being
predictive of predisposition for a particular condition, disease, or disorder. For example,
cardiovascular disease may result from malfunctioning receptor molecules that fail to clear
cholesterol from the bloodstream, and diabetes may result when a particular individual's immune
system is activated by an infection and attacks the insulin-producing cells of the pancreas. In some
studies, Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a
different gene and location. Mapping of disease genes is a complex and reiterative process and
generally proceeds from genetic linkage analysis to physical mapping.

As a condition is noted among members of a family, a genetic linkage map traces parts of 
chromosomes that are inherited in the same pattern as the condition. Statistics link the inheritance of 
particular conditions to particular regions of chromosomes, as defined by RFLP or other markers. 
(See, for example, Lander, E.S. and Botstein, D. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)
Occasionally, genetic markers and their locations are known from previous studies. More often,
however, the markers are simply stretches of DNA that differ among individuals. Examples of 
genetic linkage maps can be found in various scientific journals or at the Online Mendelian 
Inheritance in Man (OMIM) World Wide Web site. 
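
One widely used statistic for linking inheritance of a condition to a chromosomal region is the LOD (logarithm of odds) score. The following Python sketch is offered only as an illustration of the kind of calculation involved, under the simplifying assumption of phase-known, fully informative meioses; the text above does not specify which statistical method is used, and the function name is hypothetical.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for linkage at recombination fraction `theta`.

    Compares the likelihood of the observed number of recombinant meioses
    under linkage (recombination fraction theta) against the likelihood
    under free recombination (theta = 0.5).
    """
    nonrecombinants = meioses - recombinants
    likelihood_linked = (theta ** recombinants) * ((1.0 - theta) ** nonrecombinants)
    likelihood_unlinked = 0.5 ** meioses
    return math.log10(likelihood_linked / likelihood_unlinked)


if __name__ == "__main__":
    # Ten informative meioses with no recombinants between marker and condition.
    print(round(lod_score(0, 10, 0.0), 3))
```

By convention a LOD score of 3 or more is commonly taken as evidence of linkage; ten non-recombinant meioses evaluated at theta = 0 give a score of about 3.01.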

In another embodiment of the invention, mddt sequences may be used to generate
hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences.
Either coding or noncoding sequences of mddt may be used, and in some instances, noncoding
sequences may be preferable over coding sequences. For example, conservation of an mddt coding
sequence among members of a multi-gene family may potentially cause undesired cross hybridization
during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a
specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial
chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes
(BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J.
et al. (1997) Nat. Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J.
(1991) Trends Genet. 7:149-154.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome
mapping techniques and genetic map data. (See, e.g., Meyers, supra, pp. 965-968.) Correlation
between the location of mddt on a physical chromosomal map and a specific disorder, or a
predisposition to a specific disorder, may help define the region of DNA associated with that
disorder. The mddt sequences may also be used to detect polymorphisms that are genetically linked
to the inheritance of a particular condition, disease, or disorder.

In situ hybridization of chromosomal preparations and genetic mapping techniques, such as 
linkage analysis using established chromosomal markers, may be used for extending existing genetic 
maps. Often the placement of a gene on the chromosome of another mammalian species, such as
mouse, may reveal associated markers even if the number or arm of the corresponding human
chromosome is not known. These new marker sequences can be mapped to human chromosomes and
may provide valuable information to investigators searching for disease genes using positional
cloning or other gene discovery techniques. Once a disease or syndrome has been crudely correlated
by genetic linkage with a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any
sequences mapping to that area may represent associated or regulatory genes for further investigation.
(See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequences of the subject
invention may also be used to detect differences in chromosomal architecture due to translocation, 
inversion, etc., among normal, carrier, or affected individuals. 

Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned 
in order to identify mutations or other alterations (e.g., translocations or inversions) that may be
correlated with disease. This process requires a physical map of the chromosomal region containing
the disease-gene of interest along with associated markers. A physical map is necessary for
determining the nucleotide sequence of and order of marker genes on a particular chromosomal
region. Physical mapping techniques are well known in the art and require the generation of
overlapping sets of cloned DNA fragments from a particular organelle, chromosome, or genome.
These clones are analyzed to reconstruct and catalog their order. Once the position of a marker is
determined, the DNA from that region is obtained by consulting the catalog and selecting clones from
that region. The gene of interest is located through positional cloning techniques using hybridization 
or similar methods. 

Diagnostic Uses

The mddt of the present invention may be used to design probes useful in diagnostic assays. 
Such assays, well known to those skilled in the art, may be used to detect or confirm conditions, 
disorders, or diseases associated with abnormal levels of mddt expression. Labeled probes developed 
from mddt sequences are added to a sample under hybridizing conditions of desired stringency. In
some instances, mddt, or fragments or oligonucleotides derived from mddt, may be used as primers in
amplification steps prior to hybridization. The amount of hybridization complex formed is quantified
and compared with standards for that cell or tissue. If mddt expression varies significantly from the
standard, the assay indicates the presence of the condition, disorder, or disease. Qualitative or
quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based
technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent
assay (ELISA)-like, pin, or chip-based assays. 
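
The comparison of a quantified hybridization signal with a standard may be sketched as follows. This is a minimal illustration, assuming that the standard is represented by signals from several reference samples and that "varies significantly" is judged by a z-score; the function name and the cutoff of 3.0 are assumptions for the example and are not specified in the text.

```python
from statistics import mean, stdev

def deviates_from_standard(sample_signal: float,
                           reference_signals: list[float],
                           z_cutoff: float = 3.0) -> tuple[float, bool]:
    """Compare one sample's hybridization signal with reference samples.

    Returns the z-score of the sample relative to the reference mean and
    sample standard deviation, and whether its magnitude reaches the cutoff.
    """
    mu = mean(reference_signals)
    sigma = stdev(reference_signals)  # requires at least two reference values
    z = (sample_signal - mu) / sigma
    return z, abs(z) >= z_cutoff


if __name__ == "__main__":
    z, flagged = deviates_from_standard(20.0, [10.0, 12.0, 11.0, 9.0, 13.0])
    print(round(z, 2), flagged)
```

In practice the choice of statistic and cutoff would depend on the assay format and on the number and variability of the reference samples.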

The probes described above may also be used to monitor the progress of conditions, 
disorders, or diseases associated with abnormal levels of mddt expression, or to evaluate the efficacy 
of a particular therapeutic treatment. The candidate probe may be identified from the mddt that are
specific to a given human tissue and have not been observed in GenBank or other genome databases.
Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the
treatment of an individual patient. In a typical process, standard expression is established by methods
well known in the art for use as a basis of comparison, samples from patients affected by the disorder
or disease are combined with the probe to evaluate any deviation from the standard profile, and a
therapeutic agent is administered and effects are monitored to generate a treatment profile. Efficacy
is evaluated by determining whether the expression progresses toward or returns to the standard
normal pattern. Treatment profiles may be generated over a period of several days or several months.
Statistical methods well known to those skilled in the art may be used to determine the significance of
such therapeutic agents.

The polynucleotides are also useful for identifying individuals from minute biological
samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's
DNA. The polynucleotides of the present invention can also be used to determine the actual
base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be
used to prepare PCR primers for amplifying and isolating such selected DNA, which can then be
sequenced. Using this technique, an individual can be identified through a unique set of DNA
sequences. Once a unique ID database is established for an individual, positive identification of that
individual can be made from extremely small tissue samples. 

In a particular aspect, oligonucleotide primers derived from the mddt of the invention may be
used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and
deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of
SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP)
and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from the
polynucleotide sequences encoding MDDT are used to amplify DNA using the polymerase chain
reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy
samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary
structures of PCR products in single-stranded form, and these differences are detectable using gel
electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently
labeled, which allows detection of the amplimers in high-throughput equipment such as DNA
sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into common consensus sequences. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and
sequencing errors using statistical models and automated analyses of DNA sequence chromatograms.
In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example,
the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
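
The in silico comparison of overlapping fragments described above may be illustrated with the following simplified Python sketch. It assumes that the fragments have already been aligned to a common coordinate system (with "-" marking positions a fragment does not cover) and that a per-base quality value is available for each fragment; the function name, the quality threshold of 20, and the requirement of two supporting fragments per allele are assumptions for the example and stand in for the statistical models mentioned in the text.

```python
from collections import Counter

def call_candidate_snps(reads: list[str],
                        quals: list[list[int]],
                        min_qual: int = 20,
                        min_support: int = 2) -> list[tuple[int, list[str]]]:
    """Report alignment columns where two or more bases are well supported.

    `reads` are aligned fragments of equal length; `quals` holds one quality
    value per position of each read. Low-quality base calls are ignored so
    that likely sequencing errors do not produce candidate polymorphisms.
    """
    candidates = []
    for pos in range(len(reads[0])):
        counts = Counter(
            read[pos]
            for read, qual in zip(reads, quals)
            if read[pos] != "-" and qual[pos] >= min_qual
        )
        supported = sorted(base for base, n in counts.items() if n >= min_support)
        if len(supported) >= 2:
            candidates.append((pos, supported))
    return candidates


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "ATGT", "ATGT", "AGGT"]
    quals = [[30] * 4, [30] * 4, [30] * 4, [30] * 4, [10] * 4]
    print(call_candidate_snps(reads, quals))
```

In the example, position 1 is reported with alleles C and T, while the G seen in the single low-quality fragment is filtered out.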

DNA-based identification techniques are critical in forensic technology. DNA sequences
taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood,
saliva, semen, etc., can be amplified using, e.g., PCR, to identify individuals. (See, e.g., Erlich, H.
(1992) PCR Technology, Freeman and Co., New York, NY.) Similarly, polynucleotides of the
present invention can be used as polymorphic markers. 

There is also a need for reagents capable of identifying the source of a particular tissue. 
Appropriate reagents can comprise, for example, DNA probes or primers prepared from the 
sequences of the present invention that are specific for particular tissues. Panels of such reagents can 
identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to 
screen tissue cultures for contamination. 

The polynucleotides of the present invention can also be used as molecular weight markers on 
nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a 
particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel 
polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, 
and as an antigen to elicit an immune response. 

Disease Model Systems Using mddt

The mddt of the invention or their mammalian homologs may be "knocked out" in an animal 
model system using homologous recombination in embryonic stem (ES) cells. Such techniques are 
well known in the art and are useful for the generation of animal models of human disease. (See, e.g., 
U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For example, mouse ES cells, 
such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. 
The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, 
e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292). 
The vector integrates into the corresponding region of the host genome by homologous 
recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to 
knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996)
Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). 
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from 
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and 
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous 
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents. 

The mddt of the invention may also be manipulated in vitro in ES cells derived from human 
blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

The mddt of the invention can also be used to create "knockin" humanized animals (pigs) or 
transgenic animals (mice or rats) to model human disease. With knockin technology, a region of
mddt is injected into animal ES cells, and the injected sequence integrates into the animal cell
genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described
above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical
agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to
overexpress mddt, resulting, e.g., in the secretion of MDDT in its milk, may also serve as a
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

Screening Assays 

MDDT encoded by polynucleotides of the present invention may be used to screen for 
molecules that bind to or are bound by the encoded polypeptides. The binding of the polypeptide and
the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the
polypeptide or the bound molecule. Examples of such molecules include antibodies,
oligonucleotides, proteins (e.g., receptors), or small molecules. 

Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a
ligand or fragment thereof, a natural substrate, or a structural or functional mimetic. (See, Coligan et
al., (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the molecule can be closely
related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor,
e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
Preferably, the screening for these molecules involves producing appropriate cells which express the
polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from
mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions
which contain the expressed polypeptide are then contacted with a test compound and binding,
stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.

An assay may simply test binding of a candidate compound to the polypeptide, wherein
binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label.
Alternatively, the assay may assess binding in the presence of a labeled competitor.

Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule 
affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply
comprise the steps of mixing a candidate compound with a solution containing a polypeptide,
measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity
or binding to a standard.

Preferably, an ELISA assay using, e.g., a monoclonal or polyclonal antibody, can measure
polypeptide level in a sample. The antibody can measure polypeptide level by either binding, directly
or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of the above assays can be used in a diagnostic or prognostic context. The molecules
discovered using these assays can be used to treat disease or to bring about a particular result in a
patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule. Moreover, the 
assays can discover agents which may inhibit or enhance the production of the polypeptide from 
suitably manipulated cells or tissues. 

Transcript Imaging

Another embodiment relates to the use of mddt to develop a transcript image of a tissue or 
cell type. A transcript image is the collective pattern of gene expression by a particular tissue or cell 
type under given conditions and at a given time. This pattern of gene expression is defined by the 
number of expressed genes, their abundance, and their function. Thus the mddt of the present
invention may be used to develop a transcript image of a tissue or cell type by hybridizing, preferably
in a microarray format, the mddt of the present invention to the totality of transcripts or reverse 
transcripts of a tissue or cell type. The resultant transcript image would provide a profile of gene 
activity pertaining to disease detection and treatment. 

Transcript images which profile mddt expression may be generated using transcripts isolated
from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect
mddt expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell
line. Transcript images may be used to profile mddt expression in distinct tissue types. This process
can be used to determine disease detection and treatment molecule activity in a particular tissue type
relative to this activity in a different tissue type. Transcript images may be used to generate a profile
of mddt expression characteristic of diseased tissue. Transcript images of tissues before and after
treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor
the efficacy of drug treatments for diseases which affect the activity of disease detection and
treatment molecules. 

Transcript images which profile mddt expression may also be used in conjunction with in 
30 vitro model systems and preclinical evaluation of pharmaceuticals. Transcript images of cell lines 
can be used to assess disease detection and treatment molecule activity and/or to identify cell lines 
that lack or misregulate this activity. Such cell lines may then be treated with pharmaceutical agents, 
and a transcript image following treatment may indicate the efficacy of these agents in restoring 
desired levels of this activity. A similar approach may be used to assess the toxicity of 
pharmaceutical agents as reflected by undesirable changes in disease detection and treatment
molecule activity. Candidate pharmaceutical agents may be evaluated by comparing their associated
transcript images with those of pharmaceutical agents of known effectiveness. 
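
The comparison of transcript images described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a transcript image is stored as a mapping from array element name to a measured abundance, and that similarity is scored with a Pearson correlation over the shared elements. The function names, the element names, and the choice of correlation as the similarity measure are assumptions and are not part of the disclosure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def compare_transcript_images(image_a, image_b):
    """Score the similarity of two transcript images.

    Each image is a dict mapping an element name (hypothetical, e.g. "mddt_1")
    to a measured abundance. Only elements present in both images are compared.
    """
    shared = sorted(set(image_a) & set(image_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared elements to compare")
    return pearson([image_a[g] for g in shared], [image_b[g] for g in shared])


# Hypothetical example: a candidate agent versus an agent of known effectiveness.
known_agent = {"mddt_1": 1.0, "mddt_2": 2.0, "mddt_3": 3.0}
candidate = {"mddt_1": 2.0, "mddt_2": 4.0, "mddt_3": 6.0}
similarity = compare_transcript_images(known_agent, candidate)
```

A score near 1 would suggest that the candidate shifts expression in the same pattern as the reference agent; any threshold for calling two images similar would have to be chosen empirically.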

Antisense Molecules

The polynucleotides of the present invention are useful in antisense technology. Antisense
technology or therapy relies on the modulation of expression of a target protein through the specific
binding of an antisense sequence to a target sequence encoding the target protein or directing its
expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa
NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol.
40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y. et
al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a polynucleotide sequence
capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences
bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense
sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991)
Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge,
W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen, P.E. and Haaima, G.
(1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in modulation of expression
occurs through hybridization or binding of complementary base pairs. Antisense sequences can also
bind to DNA duplexes through specific interactions in the major groove of the double helix.
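
As an illustration of hybridization through complementary base pairs, the following minimal sketch derives a DNA antisense sequence as the reverse complement of a target. The function name and the example target are hypothetical, and the sketch ignores practical design factors such as target accessibility, length, and backbone chemistry.

```python
# Watson-Crick complements; U in an RNA target pairs with A in the DNA antisense.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(target):
    """Return the DNA antisense (reverse complement) of a DNA or RNA target.

    Both the input and the returned sequence are written 5' to 3'.
    """
    unexpected = set(target) - set("ACGTUacgtu")
    if unexpected:
        raise ValueError(f"unexpected characters in target: {sorted(unexpected)}")
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical six-base mRNA fragment.
oligo = antisense_dna("AUGGCC")
```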

The polynucleotides of the present invention and fragments thereof can be used as antisense
sequences to modify the expression of the polypeptide encoded by mddt. The antisense sequences
can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (PE
Biosystems) or other automated systems known in the art. Antisense sequences can also be produced
biologically, such as by transforming an appropriate host cell with an expression vector containing
the sequence of interest. (See, e.g., Agrawal, supra.)

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g.,
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood
76:271; Ausubel, F.M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons,
New York NY; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other gene
delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems
known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et al. (1998)
J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 25(14):2730-
2736.)

Expression

In order to express a biologically active MDDT, the nucleotide sequences encoding MDDT or
fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. Methods which are well known to those skilled in the art may be used to construct
expression vectors containing sequences encoding MDDT and appropriate transcriptional and
translational control elements. These methods include in vitro recombinant DNA techniques,
synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra, Chapters 4, 8,
16, and 17; and Ausubel, supra, Chapters 9, 10, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV,
or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal (mammalian) cell systems. (See, e.g., Sambrook, supra; Ausubel, 1995, supra; Van Heeke, G.
and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Bitter, G.A. et al. (1987) Methods Enzymol.
153:516-544; Scorer, G.A. et al. (1994) Bio/Technology 12:181-184; Engelhard, E.K. et al. (1994)
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945;
Takamatsu, N. (1987) EMBO J. 6:307-311; Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie,
R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105;
The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp.
191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington,
J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al.
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)
The invention is not limited by the host cell employed.






wo 00/75298 



For long term production of recombinant proteins in mammalian systems, stable expression 
of MDDT in cell lines is preferred. For example, sequences encoding MDDT can be transformed into 
cell lines using expression vectors which may contain viral origins of replication and/or endogenous 
expression elements and a selectable marker gene on the same or on a separate vector. Any number 
of selection systems may be used to recover transformed cell lines. (See, e.g., Wigler, M. et al.
(1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823; Wigler, M. et al. (1980) Proc. Natl.
Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; Hartman,
S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131.) 


Therapeutic Uses of mddt 

The mddt of the invention may be used for somatic or germline gene therapy. Gene therapy 
may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined 
immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et
al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an
inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480;
Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-
216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from
Factor VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410; Verma, I.M. and
Somia, N. (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the
case of cancers which result from unregulated cell proliferation), or (iii) express a protein which
affords protection against intracellular parasites (e.g., against human retroviruses, such as human
immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996)
Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites,
such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as
Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in mddt
expression or regulation causes disease, the expression of mddt from an appropriate population of 
transduced cells may alleviate the clinical manifestations caused by the genetic deficiency. 

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
mddt are treated by constructing mammalian expression vectors comprising mddt and introducing
these vectors by mechanical means into mddt-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii)
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene
transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev.
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998) Curr.
Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of mddt include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA),
PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF,
PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). The mddt of the invention
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV),
Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl.
Acad. Sci. U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V.
and Blau, H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX
plasmid (Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and
PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible
promoter (Rossi, F.M.V. and Blau, H.M. supra)), or (iii) a tissue-specific promoter or the native
promoter of the endogenous gene encoding MDDT from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID 
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver 
polynucleotides to target cells in culture and require minimal effort to optimize experimental 
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of 
these standardized mammalian transfection protocols. 

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to mddt expression are treated by constructing a retrovirus vector consisting of (i) the mddt of
the invention under the control of an independent promoter or the retrovirus long terminal repeat
(LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE)
along with additional retrovirus cis-acting RNA sequences and coding sequences required for
efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available
(Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A.
92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector
producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target
cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-
1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J.
Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol.
72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining retrovirus packaging
cell lines producing high transducing efficiency retroviral supernatant") discloses a method for
obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of
retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the return of
transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy
and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al.
(1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998)
Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver mddt
to cells which have one or more genetic abnormalities with respect to the expression of mddt. The
construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in
the art. Replication defective adenovirus vectors have proven to be versatile for importing genes
encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995)
Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Patent
Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by
reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544
and Verma, I.M. and Somia, N. (1997) Nature 18:389:239-242, both incorporated by reference herein.

In another alternative, a herpes-based gene therapy delivery system is used to deliver mddt to
target cells which have one or more genetic abnormalities with respect to the expression of mddt.
The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing
mddt to cells of the central nervous system, for which HSV has a tropism. The construction and
packaging of herpes-based vectors are well known to those with ordinary skill in the art. A
replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a
reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The
construction of a HSV-1 virus vector has also been disclosed in detail in U.S. Patent Number
5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby incorporated
by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant HSV d92 which consists
of a genome containing at least one exogenous gene to be transferred to a cell under the control of the
appropriate promoter for purposes including human gene therapy. Also taught by this patent are the
construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV
vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol.
163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus sequences,
the generation of recombinant virus following the transfection of multiple plasmids containing
different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and
the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver mddt to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV),
has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff,
H. and Li, K-J. (1998) Curr. Opin. Biotech. 9:464-469). During alphavirus RNA replication, a
subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic 
RNA replicates to higher levels than the full-length genomic RNA, resulting in the overproduction of 
capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). 
Similarly, inserting mddt into the alphavirus genome in place of the capsid-coding region results in 
the production of a large number of mddt RNAs and the synthesis of high levels of MDDT in vector 
transduced cells. While alphavirus infection is typically associated with cell lysis within a few days,
the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant 
of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the 
needs of the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host 
range of alphaviruses will allow the introduction of MDDT into a variety of cell types. The specific 
transduction of a subset of cells in a population may require the sorting of cells prior to transduction. 
The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA 
and RNA transfections, and performing alphavirus infections, are well known to those with ordinary 
skill in the art. 

Antibodies 

Anti-MDDT antibodies may be used to analyze protein expression levels. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments. 
For descriptions of and protocols of antibody technologies, see, e.g., Pound, J.D. (1998)
Immunochemical Protocols, Humana Press, Totowa NJ.

The amino acid sequence encoded by the mddt of the Sequence Listing may be analyzed by 
appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions 
of high immunogenicity. The optimal sequences for immunization are selected from the C-terminus, 
the N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be 
exposed to the external environment when the polypeptide is in its natural conformation. Analysis 
used to select appropriate epitopes is also described by Ausubel (1997, supra, Chapter 11.7). Peptides
used for antibody induction do not need to have biological activity; however, they must be antigenic.
Peptides used to induce specific antibodies may have an amino acid sequence consisting of at least five
amino acids, preferably at least 10 amino acids, and most preferably 15 amino acids. A peptide which
mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as
keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production. A peptide
encompassing an antigenic region may be expressed from an mddt, synthesized as described above, or
purified from human cells. 
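
The selection of hydrophilic candidate regions can be sketched as a sliding-window scan. This is only an illustration under stated assumptions: it uses the published Kyte-Doolittle hydropathy scale as a stand-in (it is not the algorithm of the LASERGENE software named above), takes the 15-residue peptide length from the text as the default window, and treats the lowest mean hydropathy as the most hydrophilic. The function name and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(protein, window=15):
    """Return (start, peptide, mean_hydropathy) for the most hydrophilic window.

    `start` is a 0-based index into `protein`. Residues outside the 20
    standard amino acids raise KeyError in this sketch.
    """
    protein = protein.upper()
    if len(protein) < window:
        raise ValueError("protein is shorter than the window")
    best_start, best_score = 0, None
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        score = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        # Strict comparison keeps the first window when scores tie.
        if best_score is None or score < best_score:
            best_start, best_score = start, score
    return best_start, protein[best_start:best_start + window], best_score


# Toy sequence: a lysine-rich stretch flanked by isoleucines.
start, peptide, score = most_hydrophilic_window("IIIIIKKKKKIIIII", window=5)
```

A hydrophilic window is only a first filter; surface exposure in the native conformation and antigenicity would still have to be confirmed experimentally.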

Procedures well known in the art may be used for the production of antibodies. Various hosts 
including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the
host species, various adjuvants may be used to increase immunological response.

In one procedure, peptides about 15 residues in length may be synthesized using an ABI
431A peptide synthesizer (PE Biosystems) using fmoc-chemistry and coupled to KLH (Sigma) by
reaction with M-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra). Rabbits are
immunized with the peptide-KLH complex in complete Freund's adjuvant. The resulting antisera are
tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum
albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat
anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well
known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting. 

In another procedure, isolated and purified peptide may be used to immunize mice (about 100
μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and
used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies.
Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of
peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are
detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific
monoclonal antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-Dickinson,
Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species
IgG) antibodies at 10 mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to
supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1
mg/ml.

Clones producing antibodies bind a quantity of labeled peptide that is detectable above
background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are
injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the
ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several
procedures for the production of monoclonal antibodies, including in vitro production, are described
in Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-MDDT activity
using protocols well known in the art, including ELISA, RIA, and immunoblotting.

Antibody fragments containing specific binding sites for an epitope may also be generated.
For example, such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin
digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges
of the F(ab')2 fragments. Alternatively, construction of Fab expression libraries in filamentous
bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity
(Pound, supra, Chaps. 45-47). Antibodies generated against polypeptide encoded by mddt can be used
to purify and characterize full-length MDDT protein and its activity, binding partners, etc.

Assays Using Antibodies

Anti-MDDT antibodies may be used in assays to quantify the amount of MDDT found in a 
particular human cell. Such assays include methods utilizing the antibody and a label to detect 
expression level under normal or disease conditions. The peptides and antibodies of the invention 
may be used with or without modification or labeled by joining them, either covalently or
noncovalently, with a reporter molecule.

Protocols for detecting and measuring protein expression using either polyclonal or 
monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and fluorescent 
activated cell sorting (FACS). Such immunoassays typically involve the formation of complexes 
between the MDDT and its specific antibody and the measurement of such complexes. These and
other assays are described in Pound (supra).

Without further elaboration, it is believed that one skilled in the art can, using the preceding 
description, utilize the present invention to its fullest extent. The following preferred specific 
embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder
of the disclosure in any way whatsoever. 

The disclosures of all patents, applications, and publications mentioned above and below, in
particular U.S. Provisional Application No. 60/137,412, filed June 3, 1999, U.S. Provisional
Application No. 60/147,542, filed August 5, 1999, U.S. Provisional Application No. 60/147,501, filed 
August 5, 1999, U.S. Provisional Application No. 60/147,500, filed August 5, 1999 are hereby 
expressly incorporated by reference. 


EXAMPLES 

I. Construction of cDNA Libraries

RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from 
various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while
others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as
TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The
resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was 
precipitated with either isopropanol or sodium acetate and ethanol, or by other routine methods. 
Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In most cases, RNA was treated with DNase. For most libraries, poly(A+) RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega Corporation (Promega), Madison WI),
OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX mRNA
purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other
RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc., Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT
plasmid system (Life Technologies), using the recommended procedures or similar methods known in
the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was
initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to
double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or
enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000,
SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia
Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction
enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene),
pSPORT1 plasmid (Life Technologies), or pINCY (Incyte). Recombinant plasmids were transformed
into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α,
DH10B, or ElectroMAX DH10B from Life Technologies.

II. Isolation of cDNA Clones

Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system 
(Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or 
WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge 
BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra 
plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN).
Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or
without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically
using PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a
FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis

cDNA sequencing reactions were processed using standard methods or high-throughput 
instrumentation such as the ABI CATALYST 800 thermal cycler (PE Biosystems) or the PTC-200 
thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific
Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system (Hamilton). cDNA sequencing
reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI
sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (PE
Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled
polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular
Dynamics); the ABI PRISM 373 or 377 sequencing system (PE Biosystems) in conjunction with
standard ABI protocols and base calling software; or other sequence analysis systems known in the
art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in
Ausubel, 1997, supra, Chapter 7.7). Some of the cDNA sequences were selected for extension using
the techniques disclosed in Example VIII.


IV. Assembly and Analysis of Sequences 

Component sequences from chromatograms were subject to PHRED analysis and assigned a 
quality score. The sequences having at least a required quality score were subject to various
pre-processing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA
tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and
sequences smaller than 50 base pairs. In particular, low-information sequences and repetitive 
elements (e.g., dinucleotide repeats, Alu repeats, etc.) were replaced by "n's", or masked, to prevent 
spurious matches. 
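
Two of the pre-processing steps above, discarding sequences smaller than 50 base pairs and masking dinucleotide repeats with "n", can be sketched as follows. The 50-base cutoff comes from the text; the minimum number of repeat units that triggers masking, the regular expression, and the function names are assumptions made for illustration, and the sketch does not attempt vector, Alu, or contaminant screening.

```python
import re

MIN_LENGTH = 50        # from the text: sequences smaller than 50 base pairs are eliminated
MIN_REPEAT_UNITS = 6   # assumption: how many tandem copies count as a repeat

# A two-base unit followed by at least MIN_REPEAT_UNITS - 1 further copies.
# Note that a mononucleotide run such as "AAAAAAAAAAAA" also matches.
DINUC_REPEAT = re.compile(
    r"([acgt]{2})\1{%d,}" % (MIN_REPEAT_UNITS - 1), re.IGNORECASE
)


def mask_dinucleotide_repeats(seq):
    """Replace each tandem dinucleotide repeat with an equal-length run of 'n'."""
    return DINUC_REPEAT.sub(lambda m: "n" * len(m.group(0)), seq)


def preprocess(seqs):
    """Drop short sequences, then mask low-information repeats in the rest."""
    kept = []
    for seq in seqs:
        if len(seq) < MIN_LENGTH:
            continue
        kept.append(mask_dinucleotide_repeats(seq))
    return kept
```

Masking with "n" rather than deleting keeps coordinates stable, so positions reported by later alignment steps still refer to the original read.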



Processed sequences were then subject to assembly procedures in which the sequences were
assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene
bin were assembled to produce consensus sequences (templates). Subsequent new sequences were
added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were
identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at
least 82% local identity were accepted into the bin. The component sequences from each bin were
assembled using a version of PHRAP. Bins with several overlapping component sequences were
assembled using DEEP PHRAP. The orientation (sense or antisense) of each assembled template was
determined based on the number and orientation of its component sequences. Template sequences as
disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading
frames), to the best determination. The complementary (antisense) strands are inherently disclosed
herein. The component sequences which were used to assemble each template consensus sequence
are listed in Table 4, along with their positions along the template nucleotide sequences.

Bins were compared against each other and those having local similarity of at least 82% were 
combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 
95% local identity) were re-split. Assembled templates were also subject to analysis by 
STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice 
variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced 
genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of 
the above assembly procedures. 

Once gene bins were generated based upon sequence alignments, bins were clone joined 
based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' 
sequence from the same clone was present in a different bin, it was likely that the two bins actually 
belonged together in a single bin. The resulting combined bins underwent assembly procedures to 
regenerate the consensus sequences. 
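
Clone joining as described above amounts to merging any two bins that hold the 5' and 3' reads of 
the same clone. A minimal sketch follows, assuming a union-find structure; the data layout 
(`read_to_bin`, `clone_to_reads`) and all identifiers are hypothetical and are not taken from the 
specification, which does not say how the merge was implemented.

```python
def clone_join(read_to_bin, clone_to_reads):
    """Merge bins that contain the 5' and 3' reads of one clone.

    read_to_bin:    dict mapping read id -> bin id (hypothetical layout)
    clone_to_reads: dict mapping clone id -> (5' read id, 3' read id)
    Returns a dict mapping every read id to its merged bin id.
    """
    parent = {b: b for b in set(read_to_bin.values())}

    def find(b):
        # Follow parent links to the representative bin, halving the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for five_prime, three_prime in clone_to_reads.values():
        a = find(read_to_bin[five_prime])
        b = find(read_to_bin[three_prime])
        if a != b:
            # Keep the lexicographically smaller id as the representative.
            parent[max(a, b)] = min(a, b)

    return {read: find(b) for read, b in read_to_bin.items()}


joined = clone_join(
    {"c1_5": "binA", "c1_3": "binB", "c2_5": "binC", "c2_3": "binC",
     "c3_5": "binB", "c3_3": "binD"},
    {"c1": ("c1_5", "c1_3"), "c2": ("c2_5", "c2_3"), "c3": ("c3_5", "c3_3")},
)
```

In this example clone c1 links binA to binB and clone c3 links binB to binD, so all three collapse 
into one bin, while binC is left alone. The merged bins would then be reassembled to regenerate 
their consensus sequences.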

The final assembled templates were subsequently annotated using the following procedure. 
Template sequences were analyzed using BLASTn (v2.0, NCBI) versus gbpri (GenBank version 116). 
"Hits" were defined as an exact match having from 95% local identity over 200 base pairs through 
100% local identity over 100 base pairs, or a homolog match having an E-value, i.e. a probability 
score, of < 1 X 10"^. The hits were subject to frameshift FASTx versus GENPEPT (GenBank version 
116). (See Table 5). In this analysis, a homolog match was defined as having an E-value of < 1 x 10' 
The assembly method used above was described in "System and Methods for Analyzing 
Biomolecular Sequences," U.S.S.N. 09/276,534, filed March 25, 1999, and the LIFESEQ Gold user 
manual (Incyte), both incorporated by reference herein. 

Following assembly, template sequences were subjected to motif, BLAST, and functional 
analyses, and categorized in protein hierarchies using methods described in, e.g., "Database System 
Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 
08/812,290, filed March 6, 1997; "Relational Database for Storing Biomolecule Information," 
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence 
Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for 
Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, 
all of which are incorporated by reference herein. 

The template sequences were further analyzed by translating each template in all three 
forward reading frames and searching each translation against the Pfam database of hidden Markov 
model-based protein families and domains using the HMMER software package (available to the 
public from Washington University School of Medicine, St. Louis MO). Regions of templates which, 
when translated, contain similarity to Pfam consensus sequences are reported in Table 2, along with 
descriptions of Pfam protein domains and families. Only those Pfam hits with an E-value of ^ 1 x 10"^ 
are reported. (See also World Wide Web site http://pfam.wustl.edu/ for detailed descriptions of Pfam 
protein domains and families.) 
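
Translation of a template in all three forward reading frames, as used before the HMMER searches, 
can be sketched as follows. The standard genetic code is assumed, and the function name and the use 
of "X" for codons containing ambiguous or masked bases are illustrative choices, not details given 
in the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_forward_frames(template):
    """Return the translations of the three forward reading frames.

    Stop codons are written as '*'; codons with bases outside A, C, G, T
    (for example masked 'n' positions) are written as 'X'.
    """
    seq = template.upper()
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames


frames = translate_forward_frames("ATGGCCTAA")
```

Each of the three resulting protein strings would then be searched separately against the profile 
database.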

Additionally, the template sequences were translated in all three forward reading frames, and 
each translation was searched against hidden Markov models for signal peptide and transmembrane 
domains using the HMMER software package. Construction of hidden Markov models and their 
usage in sequence analysis has been described. (See, for example, Eddy, S.R. (1996) Curr. Opin. Str. 
Biol. 6:361-365.) Regions of templates which, when translated, contain similarity to signal peptide or 
transmembrane domain consensus sequences are reported in Table 3. Only those signal peptide or 
transmembrane hits with a cutoff score of 11 bits or greater are reported. A cutoff score of 11 bits or 
greater corresponds to at least about 91-94% true-positives in signal peptide prediction, and at least 
about 75% true-positives in transmembrane domain prediction. 

The results of HMMER analysis as reported in Tables 2 and 3 may support the results of 
BLAST analysis as reported in Table 1 or may suggest alternative or additional properties of 
template-encoded polypeptides not previously uncovered by BLAST or other analyses. 

Template sequences are further analyzed using the bioinformatics tools listed in Table 5, or 
using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi 
Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). 
Template sequences may be further queried against public databases such as the GenBank rodent, 
mammalian, vertebrate, prokaryote, and eukaryote databases. 

V. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a 
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs 
from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel, 
1995, supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Pharmaceuticals). This analysis 
is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the 
computer search can be modified to determine whether any particular match is categorized as exact or 
similar. The basis of the search is the product score, which is defined as: 

\[
\text{product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the 
length of the sequence match. The product score is a normalized value between 0 and 100, and is 
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the 
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is 
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair 
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by 
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate 
the product score. The product score represents a balance between fractional overlap and quality in a 
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the 
entire length of the shorter of the two sequences being compared. A product score of 70 is produced 
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the 
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% 
identity and 100% overlap. 
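
The worked examples above can be checked numerically with a small sketch of the product score. The 
function names are illustrative, and the HSP is simplified to a count of matched and mismatched 
bases with no gaps, which is an assumption made only for this example.

```python
def blast_score(matches, mismatches):
    """BLAST score of one HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, length_seq1, length_seq2):
    """Product score as defined above, normalized to the shorter sequence."""
    return score * percent_identity / (5 * min(length_seq1, length_seq2))


# 100% identity over the full length of the shorter (100-base) sequence.
full_match = product_score(blast_score(100, 0), 100, 100, 250)

# 100% identity over 70 of 100 bases.
partial_overlap = product_score(blast_score(70, 0), 100, 100, 100)

# 88% identity over all 100 bases (88 matches, 12 mismatches).
lower_identity = product_score(blast_score(88, 12), 88, 100, 100)
```

Under these assumptions the first two cases give exactly 100 and 70, and the third gives about 69, 
consistent with the statement that 88% identity over the full length yields a product score of 
roughly 70.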

VI. Tissue Distribution Profiling 

A tissue distribution profile is determined for each template by compiling the cDNA library 
tissue classifications of its component cDNA sequences. Each component sequence is derived from 
a cDNA library constructed from a human tissue. Each human tissue is classified into one of the 
following categories: cardiovascular system; connective tissue; digestive system; embryonic 
structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic 
and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; 
sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. Template sequences, 
component sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD 
database (Incyte Genomics, Palo Alto CA). 
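
Compiling a tissue distribution profile, as described above, is essentially a tally of tissue 
categories over the component sequences of a template. The sketch below assumes each component 
sequence is identified by the library it came from and that a lookup from library to tissue 
category is available; the library names and the choice to report fractions are hypothetical.

```python
from collections import Counter


def tissue_distribution(component_libraries, library_tissue):
    """Fraction of a template's component sequences in each tissue category.

    component_libraries: one library id per component sequence
    library_tissue:      dict mapping library id -> tissue category
    """
    counts = Counter(library_tissue[lib] for lib in component_libraries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tissue: n / total for tissue, n in counts.items()}


profile = tissue_distribution(
    ["LIB1", "LIB2", "LIB3", "LIB1"],
    {"LIB1": "nervous system", "LIB2": "liver", "LIB3": "nervous system"},
)
```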

VII. Transcript Image Analysis 

Transcript images are generated as described in Seilhamer et al., "Comparative Gene 
Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference. 

VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length cDNA 

Oligonucleotide primers designed using an mddt of the Sequence Listing are used to extend 
the nucleic acid sequence. One primer is synthesized to initiate 5' extension of the template, and the 
other primer, to initiate 3' extension of the template. The initial primers may be designed using 
OLIGO 4.06 software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another 
appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or 
more, and to anneal to the target sequence at temperatures of about 68°C to about 72°C. Any stretch 
of nucleotides which would result in hairpin structures and primer-primer dimerizations is avoided. 
Selected human cDNA libraries are used to extend the sequence. If more than one extension is 
necessary or desired, additional or nested sets of primers are designed. 
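
Two of the primer criteria stated above, length and GC content, are simple to screen 
programmatically. The sketch below is not a substitute for OLIGO 4.06 or a comparable program: the 
function names are hypothetical, and annealing temperature, hairpin formation, and primer-primer 
dimerization are deliberately not modeled.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_primer_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check the length (about 22 to 30 nt) and GC content (about 50% or more)."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


candidate = "GC" * 6 + "AT" * 5  # 22 nt, 12 of 22 bases are G or C
accepted = meets_primer_criteria(candidate)
```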

High fidelity amplification is obtained by PCR using methods well known in the art. PCR is 
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research). The reaction mix 
contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and 
β-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life 
Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair 
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the 
alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; 
Step 6: 68°C, 5 min; Step 7: storage at 4°C. 
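
The cycling parameters above can be written down as data and summed to estimate the programmed 
run time. This is a sketch under one stated assumption: "repeated 20 times" is read as 20 further 
passes after the first pass through Steps 2 to 4, giving 21 passes in total. Ramp times between 
temperatures and the 4°C storage step are ignored, and all names are illustrative.

```python
from typing import List, Tuple

Step = Tuple[int, int]  # (temperature in deg C, hold time in seconds)

PCI_CYCLE: List[Step] = [(94, 15), (60, 60), (68, 120)]    # primer pair PCI A / PCI B
T7_SK_CYCLE: List[Step] = [(94, 15), (57, 60), (68, 120)]  # primer pair T7 / SK+


def programmed_hold_seconds(initial: Step, cycle: List[Step], final: Step,
                            repeats: int) -> int:
    """Total programmed hold time, excluding ramps and the storage step."""
    passes = 1 + repeats  # assumption: first pass plus `repeats` further passes
    return initial[1] + passes * sum(seconds for _, seconds in cycle) + final[1]


pci_seconds = programmed_hold_seconds((94, 180), PCI_CYCLE, (68, 300), repeats=20)
```

Under this reading each program holds for 4575 seconds, a little over 76 minutes. If Step 5 is 
instead read as 20 passes in total, pass `repeats=19`.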

The concentration of DNA in each well is determined by dispensing 100 µl PICOGREEN 
quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA (TE) and 0.5 µl of 
undiluted PCR product into each well of an opaque fluorimeter plate (Corning Incorporated 
(Corning), Corning NY), allowing the DNA to bind to the reagent. The plate is scanned in a 
FLUOROSKAN II (Labsystems Oy) to measure the fluorescence of the sample and to quantify the 
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture is analyzed by electrophoresis 
on a 1% agarose mini-gel to determine which reactions are successful in extending the sequence. 

The extended nucleotides are desalted and concentrated, transferred to 384-well plates, 
digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison WI), and 
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For 
shotgun sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) 
agarose gels, fragments are excised, and agar digested with AGAR ACE (Promega). Extended clones 
are religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC 18 vector 
(Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction 
site overhangs, and transfected into competent E. coli cells. Transformed cells are selected on 
antibiotic-containing media, and individual colonies are picked and cultured overnight at 37°C in 
384-well plates in LB/2x carbenicillin liquid media. 

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham 
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN 
reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified 
using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2, 
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC 
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle 
sequencing ready reaction kit (PE Biosystems). 

In like manner, the mddt is used to obtain regulatory sequences (promoters, introns, and 
enhancers) using the procedure above, oligonucleotides designed for such extension, and an 
appropriate genomic library. 

IX. Labeling of Probes and Southern Hybridization Analyses 

Hybridization probes derived from the mddt of the Sequence Listing are employed for 
screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and 
1000 nucleotides in length is specifically described, but essentially the same procedure may be used 
with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using 
a T4 polynucleotide kinase, γ-32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech) 
buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The 
probe mixture is diluted to 10^ dpm/µg/ml hybridization buffer and used in a typical membrane-based 
hybridization analysis. 

The DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed 
through a 0.7% agarose gel. The DNA fragments are transferred from the agarose to nylon membrane 
(NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the 
manufacturer of the membrane. Prehybridization is carried out for three or more hours at 68 °C, and 
hybridization is carried out overnight at 68 °C. To remove non-specific signals, blots are sequentially 
washed at room temperature under increasingly stringent conditions, up to 0.1 x saline sodium citrate 
(SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER 
cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of 
standard and experimental lanes are compared. Essentially the same procedure is employed when 
screening RNA. 

X. Chromosome Mapping of mddt 

The cDNA sequences which were used to assemble SEQ ID NO: 1-14 are compared with 
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other 
implementations of the Smith-Waterman algorithm. Sequences from these databases that match SEQ 
ID NO: 1-14 are assembled into clusters of contiguous and overlapping sequences using assembly 
algorithms such as PHRAP (Table 5). Radiation hybrid and genetic mapping data available from 
public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for 
Genome Research (WIGR), and Genethon are used to determine if any of the clustered sequences 
have been previously mapped. Inclusion of a mapped sequence in a cluster will result in the 
assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. 

The genetic map locations of SEQ ID NO: 1-14 are described as ranges, or intervals, of human 
chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus 
of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination 
frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase 
(Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) 
The cM distances are based on genetic markers mapped by Genethon which provide boundaries for 
radiation hybrid markers whose sequences were included in each of the clusters. 
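
The rule that a mapped sequence lends its map location to every member of its cluster can be 
sketched as below. The data layout, the identifiers, and the representation of a location as 
(chromosome, start cM, end cM) are hypothetical. Leaving a cluster unassigned when its mapped 
members disagree is an assumption of this sketch; the specification does not address that case.

```python
def assign_map_locations(clusters, mapped):
    """Propagate a known map location to all sequences in the same cluster.

    clusters: dict mapping cluster id -> list of sequence ids
    mapped:   dict mapping sequence id -> (chromosome, start_cM, end_cM)
    Returns a dict mapping sequence id -> location for assignable clusters.
    """
    assignments = {}
    for members in clusters.values():
        locations = {mapped[m] for m in members if m in mapped}
        if len(locations) == 1:  # exactly one consistent location in the cluster
            location = next(iter(locations))
            for member in members:
                assignments[member] = location
    return assignments


locations = assign_map_locations(
    {"cluster1": ["SEQ_A", "EST_1", "EST_2"], "cluster2": ["SEQ_B", "EST_3"]},
    {"EST_2": ("7", 12.5, 18.0)},
)
```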

XI. Microarray Analysis 

Probe Preparation from Tissue or Cell Samples 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and 
polyA+ RNA is purified using the oligo (dT) cellulose method. Each polyA+ RNA sample is reverse 
transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-dT primer (21mer), 1X first strand 
buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM dCTP, 40 
µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription 
reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with GEMBRIGHT kits 
(Incyte). Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding 
yeast genomic DNA (W. Lei, unpublished). As quantitative controls, the control mRNAs at 0.002 ng, 
0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 
1:10,000, 1:1000, 1:100 (w/w) to sample mRNA respectively. The control mRNAs are diluted into 
reverse transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA 
differential expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with 
Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated 
for 20 minutes at 85°C to stop the reaction and degrade the RNA. Probes are purified using two 
successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. 
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated 
using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The probe is 
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and 
resuspended in 14 µl 5X SSC/0.2% SDS. 

Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element 
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification 
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia 
Biotech). 

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water 
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR 
Scientific Products Corporation (VWR), West Chester, PA), washed extensively in distilled water, 
and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 
110°C oven. 

Array elements are applied to the coated glass substrate using a procedure described in US 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average 
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic 
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 
0.2% SDS and distilled water as before. 

Hybridization 

Hybridization reactions contain 9 µl of probe mixture consisting of 0.2 µg each of Cy3 and 
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The probe mixture 
is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 
cm2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger 
than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 µl 
of 5x SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 
hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% SDS), 
three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried. 

Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an 
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines 
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is 
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide 
containing the array is placed on a computer-controlled X-Y stage on the microscope and 
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned 
with a resolution of 20 micrometers. 

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The 
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is 
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, 
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 

The sensitivity of the scans is typically calibrated using the signal intensity generated by a 
cDNA control species added to the probe mix at a known concentration. A specific location on the 
array contains a complementary DNA sequence, allowing the intensity of the signal at that location to 
be correlated with a weight ratio of hybridizing species of 1:100,000. When two probes from 
different sources (e.g., representing test and control cells), each labeled with a different fluorophore, 
are hybridized to a single array for the purpose of identifying genes that are differentially expressed, 
the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and 
adding identical amounts of each to the hybridization mixture. 

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC 
computer. The digitized data are displayed as an image where the signal intensity is mapped using a 
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high 
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and 
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping 
emission spectra) between the fluorophores using each fluorophore emission spectrum. 
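
A linear 20-color pseudocolor mapping of the kind described above can be sketched as a binning of 
the digitized intensity into 20 equal-width steps, with index 0 standing for the blue (low signal) 
end and index 19 for the red (high signal) end. The function name and the full-scale value used in 
the example are assumptions; the actual color table is not given in the specification.

```python
def pseudocolor_index(value, max_value, n_colors=20):
    """Map an intensity in [0, max_value] linearly onto color bins 0..n_colors-1."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    clipped = min(max(value, 0), max_value)
    # Integer arithmetic keeps the bin edges exact; the top value joins the last bin.
    return min(clipped * n_colors // max_value, n_colors - 1)


FULL_SCALE = 4095  # illustrative full-scale count, not taken from the text
low = pseudocolor_index(0, FULL_SCALE)
mid = pseudocolor_index(2048, FULL_SCALE)
high = pseudocolor_index(FULL_SCALE, FULL_SCALE)
```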

A grid is superimposed over the fluorescence signal image such that the signal from each spot 
is centered in each element of the grid. The fluorescence signal within each element is then 
integrated to obtain a numerical value corresponding to the average intensity of the signal. The 
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). 
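
The grid-based quantitation described above can be illustrated with a pure-Python sketch that 
averages the pixel values inside each grid element. It assumes a regular grid of equal-sized 
rectangular elements aligned to the image origin; the function name and this layout are 
hypothetical and do not describe how GEMTOOLS works internally.

```python
def grid_mean_intensities(image, cell_height, cell_width):
    """Average pixel intensity within each element of a regular grid.

    image is a list of equal-length rows of pixel values. Partial cells at
    the right or bottom edge are ignored.
    """
    n_rows, n_cols = len(image), len(image[0])
    means = []
    for top in range(0, n_rows - cell_height + 1, cell_height):
        row_means = []
        for left in range(0, n_cols - cell_width + 1, cell_width):
            pixels = [
                image[r][c]
                for r in range(top, top + cell_height)
                for c in range(left, left + cell_width)
            ]
            row_means.append(sum(pixels) / len(pixels))
        means.append(row_means)
    return means


spot_means = grid_mean_intensities(
    [[1, 1, 5, 5],
     [1, 1, 5, 5],
     [0, 2, 8, 8],
     [2, 0, 8, 8]],
    cell_height=2,
    cell_width=2,
)
```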

XII. Complementary Nucleic Acids 

Sequences complementary to the mddt are used to detect, decrease, or inhibit expression of 
the naturally occurring nucleotide. The use of oligonucleotides comprising from about 15 to 30 base 
pairs is typical in the art. However, smaller or larger sequence fragments can also be used. 
Appropriate oligonucleotides are designed from the mddt using OLIGO 4.06 software (National 
Biosciences) or other appropriate programs and are synthesized using methods standard in the art or 
ordered from a commercial supplier. To inhibit transcription, a complementary oligonucleotide is 
designed from the most unique 5' sequence and used to prevent transcription factor binding to the 
promoter sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent 
ribosomal binding and processing of the transcript. 

XIII. Expression of MDDT 

Expression and purification of MDDT is accomplished using bacterial or virus-based 
expression systems. For expression of MDDT in bacteria, cDNA is subcloned into an appropriate 
vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of 
cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) 
hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator 
regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., 
BL21(DE3). Antibiotic resistant bacteria express MDDT upon induction with isopropyl 
beta-D-thiogalactopyranoside (IPTG). Expression of MDDT in eukaryotic cells is achieved by infecting 
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus 
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is 
replaced with cDNA encoding MDDT by either homologous recombination or bacterial-mediated 
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong 
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to 
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. 
Infection of the latter requires additional genetic modifications to baculovirus. (See e.g., Engelhard, 
supra; and Sandig, supra.) 

In most expression systems, MDDT is synthesized as a fusion protein with, e.g., glutathione 
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 
26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on 
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham 
Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from 
MDDT at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity 
purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman 
Kodak Company, Rochester NY). 6-His, a stretch of six consecutive histidine residues, enables 
purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are 
discussed in Ausubel (1995, supra, Chapters 10 and 16). Purified MDDT obtained by these methods 
can be used directly in the following activity assay. 

XIV. Demonstration of MDDT Activity 

MDDT, or biologically active fragments thereof, are labeled with Bolton-Hunter reagent. 
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules 
previously arrayed in the wells of a multi-well plate are incubated with the labeled MDDT, washed, 
and any wells with labeled MDDT complex are assayed. Data obtained using different 
concentrations of MDDT are used to calculate values for the number, affinity, and association of 
MDDT with the candidate molecules. 

Alternatively, molecules interacting with MDDT are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially 
available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH). 

MDDT may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) 
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. 
Patent No. 6,057,101). 

XV. Functional Assays 

MDDT function is assessed by expressing mddt at physiologically elevated levels in 
mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing 
a strong promoter that drives high levels of cDNA expression. Vectors of choice include pCMV 
SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which 
contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into 
a human cell line, preferably of endothelial or hematopoietic origin, using either liposome 
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a 
marker protein are co-transfected. 

Expression of a marker protein provides a means to distinguish transfected cells from 
nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector. 
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a 
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is 
used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of 
the cells and other cellular properties. 

FCM detects and quantifies the uptake of fluorescent molecules that diagnose events 
preceding or coincident with cell death. These events include changes in nuclear DNA content as 
measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured 
by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as 
measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and 
intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma 
membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to 
the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow 
Cytometry, Oxford, New York NY. 

The influence of MDDT on gene expression can be assessed using highly purified 
populations of cells transfected with sequences encoding MDDT and either CD64 or CD64-GFP. 
CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions 
of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected 
cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., 
Lake Success NY). mRNA can be purified from the cells using methods well known by those of skill 
in the art. Expression of mRNA encoding MDDT and other genes of interest can be analyzed by 
northern analysis or microarray techniques. 

XVI. Production of Antibodies 

MDDT substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the MDDT amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)

Typically, peptides 15 residues in length are synthesized using an ABI 431A peptide
synthesizer (PE Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with
N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g.,
Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in complete Freund's
adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to 
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with
radio-iodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-MDDT activity
using protocols well known in the art, including ELISA, RIA, and immunoblotting.

XVII. Purification of Naturally Occurring MDDT Using Specific Antibodies

Naturally occurring or recombinant MDDT is substantially purified by immunoaffinity 
chromatography using antibodies specific for MDDT. An immunoaffinity column is constructed by 
covalently coupling anti-MDDT antibody to an activated chromatographic resin, such as 
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is 
blocked and washed according to the manufacturer's instructions.

Media containing MDDT are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of MDDT (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/MDDT binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such 
as urea or thiocyanate ion), and MDDT is collected. 

All publications and patents mentioned in the above specification are herein incorporated by 
reference. Various modifications and variations of the described method and system of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with specific preferred 
embodiments, it should be understood that the invention as claimed should not be unduly limited to 
such specific embodiments. Indeed, various modifications of the above-described modes for carrying 
out the invention which are obvious to those skilled in the field of molecular biology or related fields 
are intended to be within the scope of the following claims. 

TABLE 3

SEQ ID NO:  Template ID  Start  Stop  Frame      Domain Type
1           222197.6     317    406   forward 2  SP
1           222197.6     901    984   forward 1  TM
2           227709.3     563    649   forward 2  SP
5           243096.6     3096   3182  forward 3  SP
6           244366.6     2801   2878  forward 2  TM
7           405313.4     2256   2333  forward 3  TM
7           405313.4     1603   1589  forward 3  TM

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
1           222197.6     3989355H1     1      122
1           222197.6     3989355R6     1      462
1           222197.6     g1189739      58     533
1           222197.6     g1123521      56     494
1           222197.6     3417884H2     105    341
1           222197.6     3398916H1     111    329
1           222197.6     696738H1      228    480
1           222197.6     3387328H1     248    542
1           222197.6     3387328F6     248    705
1           222197.6     640954H1      499    771
1           222197.6     640954R1      499    841
1           222197.6     2674395H1     544    647
1           222197.6     4871937H1     609    809
1           222197.6     6014949H1     680    954
1           222197.6     1310167H1     729    952
1           222197.6     1310167F6     729    1163
1           222197.6     3422058H1     733    986
1           222197.6     1429773H1     748    1016
1           222197.6     1429773F6     748    1211
1           222197.6     4725459H1     770    889
1           222197.6     2692245H1     774    1025
1           222197.6     2692245F6     774    1300
1           222197.6     2658283H1     818    1051
1           222197.6     4402233H1     847    1083
1           222197.6     673783H1      847    1089
1           222197.6     487422H1      871    1123
1           222197.6     3928678H1     898    1175
1           222197.6     2641613F6     1019   1494
1           222197.6     2641613H1     1019   1259
1           222197.6     2770396H1     1027   1273
1           222197.6     2599469H1     1040   1311
1           222197.6     486115H1      1181   1456
1           222197.6     1626615H1     1247   1456
1           222197.6     1626615F6     1247   1728
1           222197.6     383522H1      1260   1526
1           222197.6     3355a67H1     1261   1532
1           222197.6     3617236H1     1286   1573
1           222197.6     3510978H1     1300   1567
1           222197.6     1568105H1     1325   1446
1           222197.6     1571377H1     1326   1550
1           222197.6     3806389H1     1368   1628
1           222197.6     g774888       1369   1729
1           222197.6     2995341H1     1382   1634
1           222197.6     5547807H1     1412   1611
1           222197.6     619375H1      1501   1738
1           222197.6     g1962367      1501   1997
1           222197.6     2695323H1     1517   1790
1           222197.6     3142880H1     1518   1792
1           222197.6     3805357H1     1518   1820
1           222197.6     1962884H1     1538   1808

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
1           222197.6     1807088F6     1555   2037
1           222197.6     1400677H1     1628   1894
1           222197.6     811652H1      1660   1954
1           222197.6     3048110H1     1665   1918
1           222197.6     3048110F6     1665   1992
1           222197.6     3048102H1     1665   1965
1           222197.6     5059358H1     1668   1965
1           222197.6     215G138H1     1671   1925
1           222197.6     039720H1      1711   1970
1           222197.6     2434795H1     3030   3132
1           222197.6     2130701H1     3039   3136
1           222197.6     2647674H1     1740   1841
1           222197.6     3448915H1     1810   2065
1           222197.6     4377215H1     1841   2043
1           222197.6     4317587H1     1841   1916
1           222197.6     1851036H1     1841   2033
1           222197.6     5108761H1     1861   2107
1           222197.6     2893265H1     1873   2139
1           222197.6     1568060H1     1878   2083
1           222197.6     1568004H1     1878   2097
1           222197.6     3531152H1     1889   2207
1           222197.6     2851378H1     1922   2260
1           222197.6     1931202H1     1922   2196
1           222197.6     2132091R6     1936   2208
1           222197.6     2132091H1     1936   2096
1           222197.6     5288578H1     1941   2067
1           222197.6     92110737      1944   2226
1           222197.6     2207507H1     1996   2250
1           222197.6     2345452H1     1996   2256
1           222197.6     3414767H1     2025   2265
1           222197.6     1306171F6     2042   2378
1           222197.6     1306171H1     2042   2284
1           222197.6     4588038H1     2066   2349
1           222197.6     4587760H1     2066   2221
1           222197.6     1389765H1     2108   2367
1           222197.6     g1137612      2112   2425
1           222197.6     2663717H1     2146   2388
1           222197.6     3321090H1     2152   2435
1           222197.6     3840347H1     2151   2339
1           222197.6     2415559F6     2175   2618
1           222197.6     2415559H1     2175   2420
1           222197.6     3146224H1     2178   2430
1           222197.6     4201740H1     2178   2451
1           222197.6     3713261H1     2194   2446
1           222197.6     5900620H1     2195   2484
1           222197.6     1239238H1     2205   2358
1           222197.6     1965353R6     2235   2691
1           222197.6     1965363H1     2235   2500
1           222197.6     1471606H1     2273   2483
1           222197.6     3929839H1     2278   2578

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
1           222197.6     1521966H1     2278   2474
1           222197.6     1449385H1     2291   2533
1           222197.6     2381896H1     2307   2560
1           222197.6     2381895H1     2307   2559
1           222197.6     4643037H1     2326   2554
1           222197.6     3703243H1     2351   2650
1           222197.6     4295130H1     2350   2618
1           222197.6     4296184H1     2350   2589
1           222197.6     5841522H2     2396   2675
1           222197.6     3790395F6     2413   2974
1           222197.6     38n615H1      2413   2745
1           222197.6     2353349H1     2414   2513
1           222197.6     569817H1      2413   2660
1           222197.6     1621792H1     2413   2625
1           222197.6     520836H1      2417   2637
1           222197.6     92161759      2433   2797
1           222197.6     1463438H1     2433   2622
1           222197.6     1621792T6     2437   3096
1           222197.6     3475011H1     2442   2682
1           222197.6     1969764H1     2447   2686
1           222197.6     4188958H1     2451   2774
1           222197.6     2134836H1     2458   2578
1           222197.6     4054390H1     2467   2749
1           222197.6     5185060H1     2468   2694
1           222197.6     4058390H1     2468   2580
1           222197.6     4024007H1     2469   2784
1           222197.6     5597388H1     2493   2771
1           222197.6     3934968H1     2512   2787
1           222197.6     2770396T6     2518   3095
1           222197.6     993964H1      2526   2698
1           222197.6     1807088T6     2531   3099
1           222197.6     2132091T6     2530   3101
1           222197.6     1965395T6     2532   3098
1           222197.6     1805709H1     2532   2781
1           222197.6     4466288H1     2537   2803
1           222197.6     3020435H1     2537   2821
1           222197.6     g2355832      2538   3035
1           222197.6     1672661H1     2554   2667
1           222197.6     1881147H1     2554   2807
1           222197.6     5098316H1     2573   2856
1           222197.6     1429773T6     2573   3089
1           222197.6     1626615T6     2584   3091
1           222197.6     1479854T6     2587   3117
1           222197.6     3935053H1     2598   2897
1           222197.6     3930918H1     2598   2915
1           222197.6     1654064H1     2608   2860
1           222197.6     2951301H1     2619   2908
1           222197.6     94223642      2627   3028
1           222197.6     2752320H1     2628   2928
1           222197.6     92161260      2634   3031
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Templote ID 


Component ID 


Start 


StOD 


222197.6 


13061 71T6 


2636 


3099 

WW 7 7 


222197.6 


3387328T6 


2640 


.^009 


222197.6 


81 1652T6 


2639 


^OQ7 


222197.6 


1959841H1 


9659 


oom 


222197.6 


1 96984 1T6 


9A59 


'^OO^ 

OU7«* 


222197.6 


1 959841 R6 




^1 in 


222197.6 


n49Af5714 


9A*W 
zooo 




222197 6 




OAAO 


o toy 


222197 6 


a39-dQ7Al 


OAA^ 
^OOu 


n VIA 


1 7 / ,0^ 


n^l 1^07n 

y** 1 i^iy/u 


OAA1 


3136 


0001Q7 A 


/uoijoun 1 


OAAA 


2908 






2673 


2971 


O0O1O7 A 

.^IZ^C 1 7/.0 


Oil 1 J^*^ ROTA 




3097 


000107 A 


g4oy4oo2 


2678 


3067 


000107 A 


g^io 1 / 967 


2680 


3140 


OOO 1 07 A 

ly/.o 


g 1 o4oo6o 


2685 


3028 


OOOIO"? A 


o/y0395T6 


2684 


3120 




g4o93109 


2684 


3136 


0001 O"? A 


Irti C7^ii m 

191 5/61 Hi 


2689 


2948 


00010*7 A 


QZ 1 OO/ 74 


2702 


3136 




y9635UHl 


2729 


2980 




y96350Rl 


2729 


3028 


00010*7 A 


yVoooUT 1 


2729 


2992 


OOOIO"? A 


Vy/4o4nl 


2731 


3034 


OOOIOV A 


AUf\ 1 7 Art A. 


2736 


2990 


000107 A 


4oU 1 / oRo 


2736 


3028 


O0O1O7 A 

ly/ .o 


01 01 A7U1 

y 1 z 1 o/n 1 


27 AA 


3042 


OOO 107 A 


1 o 1 oooori i 


07 >l yi 


3006 


17/ .1^ 


n A'%A 1 0A 

y OOO 1 yo 


07A0 

4i/00 


3141 


990107 A 

1 7/ .(J 


ouo/u 1 1 n 1 


^/o4 


3089 


999107 A 


OOAOTAAWl 

^^ou/ oori i 




3062 


999107 A 


0^7*^7^!^ 


Olio 9 


olo9 


999107 A 


1 AAAnAHMl 
1 uoouown 1 


ofiin 


cuA 1 


222197.6 


089 1A9^ 


OA'^A 
4CO«30 




222197.6 




Oft^*^ 
ZOOO 


tjuy^ 


222197.6 


2328044H1 




Olio 


222197.6 


a4371777 




^1^1 

O l*» 1 


222197.6 


01516072 


9860 




222197.6 


02433046 


9865 


OU7 1 


222197.6 


2648474H1 


98AA 


O 1 


222197.6 


2434653H1 


2871 


3069 


222197.6 


1878079H1 


2881 


3147 


222197.6 


g4109641 


2896 


3139 


222197.6 


242851 4H1 


2909 


3098 


222197.6 


4146636H1 


2909 


3172 


222197.6 


4703838H1 


'2ff29 


3139 


222197.6 


3125420H1 


2934 


3139 


222197.6 


5942277H1 


2971 


3137 


227709.3 


783646H1 


1577 


1867 


227709.3 


2314211 HI 


1585 


1834 
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TABLE 4 



ir^ MO- 
IL/ iNV-./. 




vca-orriponenT lu 


oTQiT 


OTOp 


o 




o^zooon 1 


1 DV\J 


1 ooz 


o 


907 7nO 'I 

zz/ /uy.o 


1 ft 'R'^ 9^1 DA 


1 oyo 


ZUU*4 


o 


zz/ /uy.o 


- 1 QOOZ4 1 n 1 


I ovo 


1 000 


o 


ZZ/ /UV.vJ 


1 ^/l 1 1 7WT 
1 0^ M / n 1 


1 Ano 
1 ouy 


1 7R9 

1 /oz 


o 


ZZ/ /UV..0 


^yoo I oon 1 


1 AT 7 
101/ 


1 on^^ 
1 yuo 


o 


9977nO 

zz/ /uy.o 


*4y 1 zo / 1 n 1 


1 A9ft 
1 ozo 


1 Ol ft 

1 y 1 0 


o 


997 7nO 

zz/ /uy.o 


zz^oyo/n 1 


1 000 


1 A71 
1 0/ 1 


n 

£. 


997 7nO 

zz / /uy.o 


O/OOOUri 1 


1 oou 


1 AftO 
1 OoZ 


o 


9977nO <l 

zz/ /uy.o 


/1O'\O7ft0U 1 

^yov/ozn 1 


1 ooz 


1 /oo 


o 


9977nO 'I 

zz/ /uy.o 


t oovouyn 1 


1 AA'5 

1 000 


1 07 /I 

! y/4 


o 


9977nO <i 

zz / /uy.o 


I 0/C50U^n 1 


1 oyo 


1 O/IO 

1 y4u 


z. 


997 7nO ^5 

zz/ /uy.o 


loouyzon 1 


1 AOA 

1 oyo 


1 01A 


o 
Z. 


9977nO '5 

zz/ /uV.o 


1 OO/IOI 7U 1 

1 oyAy 1 /n 1 


1 AOA 

1 oyo 


1 01 0 

f y 1 z 


o 

Z. 


Zz/ /uy.o 


zoVAv \ 1 


1 7 1 A 

1 / 10 


1 Q 1 c: 


z 


zz / /uy.o 


oU0O0/'4n 1 


1 71 >! 

1/14 


OOQ yi 


z. 


zz/ /uy.o 


oUoo/on 1 


T 7 1 >1 

1/14 


00 1 0 

zu ly 


o 


zz / /uy.o 


oUoo loon 1 


171/1 

1/14 


zUoo 


o 
z 


zz / /uy.o 


QOQ £^7Qm 


1 7 1 y| 
1/14 


O'ic 1 

zool 


o 
z 


zz//uy.o 


oooZyUzn 1 


1 7 1 A 

I / 1 0 


1 yo4 


z 


zz / /uy.o 


0 1 zzuozn 1 


1 700 

1 /zy 


007C 

zU/o 


o 
z 


zz/ /UV.O 


0 1 u/oyun 1 


1 700 

1 /zy 


00 /I A 

ZU40 


z 


zz/ /uy.o 


oy ZD^ZZn 1 


1 7Qc: 
1 /oO 


zuoy 


z 


zz/ /UV.O 


^0/0/ozn 1 


1 7>1A 

1 /40 


ooco 

zuoy 


o 
z 


00V7OO Q 

zz/ /uy.o 


070 1 1 AflUI 1 

0/0 1 loon 1 


1 7 A "5 

1 /Oo 


1 yoo 


o 
z 


zz/ /uy.o 


O/tAOR/lOU 1 

z4oyMzn 1 


1 7A'3 

1 /oo 


OOOA 

zUUo 


z 


zz/ /uy.o 


^OOOOoAn 1 


1 7A£; 

1 /CO 


0071 

zU/ 1 


z 


097700 '3 

zz / /uy.o 


'5^779 T CLOU 1 

0 / / z 1 oy n 1 


1 779 

1 / /z 


90ft7 

ZUo/ 


z 


997 700 Q 

zz / /uy.o 


1 /I <l AOO 1 C A 
I tJoOyU 1 rO 


1 7fl7 

1 /o/ 


00 1 A 

zz 10 


z 


997700 Q 

zz / /uy.o 


1 ^ooyuzn i 


1 7ft7 
1 /o/ 


ZUo 1 




9977nO 

zz/ /uy.o 


1 y1'^A009C 1 

1 ^ooyuzr 1 


1 7ft7 
1 /o/ 


9/11 7 

z4 1 / 


z 


997 700 Q 

zz/ /uy.o 


7Q97Af>LU T 

/oZ/OOn 1 


1 70A 

1 /yo 


zU4o 


o 
z 


097700 

zz/ /uy.o 


CO 1 OQ AU 1 

00 1 ooOri 1 


1 70A 

1 /yo 


ZU04 


z 


9977nO 

zz / /uy.o 


/ OZ/OOK 1 


1 70A 

i /yo 


zooy 


z 


097700 
zz/ /uy.o 


OOQ yiOOUl 


1 70A 

1 /yo 


9070 
ZU/Z 


z 


007700 

zz/ /uy.o 


1 /oo mzri I 


1 ft! 1 
loll 


ZU/Z 


z 


997700 

zz / /uy.o 


zuoooy^n 1 


1 ft! 7 
1 0 1 / 


90ft'^ 
ZUOO 


z 


907700 
zz / /uy.o 


1 00 1 yz/ n 1 


1 oon 
i ozu 


90 "^A 
ZUOO 


z 


997700 ^ 
zz / /uy.o 


t zo 1 z 1 un 1 


1 ft9ft 
1 ozo 


1 yoo 


z 


997700 
zz / / uy.o 


Ai ft 1 ARWl 

0101 oon 1 


1 00^ 


91 '^'^ 

z 1 00 


z 


007700 

zz/ /uy.o 


n79nAAWi 
u/zuoon I 


1 Ooo 


9079 
ZU/Z 


z 


997700 

zz / /uy.o 


yzuoz^n 1 


1 017 
1 00/ 


01 7Q 

z 1 /o 


2           227709.3     g2030053      1840   2258
2           227709.3     g681548       1845   2248
2           227709.3     91190789      1847   2173
2           227709.3     3052574H1     1853   2158
2           227709.3     4646684H1     1856   1963
2           227709.3     g1846206      1855   2184
2           227709.3     1231442H1     1856   2170
2           227709.3     4546692H1     1856   1961
2           227709.3     1231220H1     1856   2108

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
2           227709.3     5693680H1     1861   2153
2           227709.3     2040451H1     1861   2194
2           227709.3     5781387H1     1861   2129
2           227709.3     3506145H1     1861   2181
2           227709.3     3479633H1     1869   2214
2           227709.3     2196589H1     1861   2123
2           227709.3     3872090H1     1867   2086
2           227709.3     1210943R1     1867   2217
2           227709.3     072676H1      1867   2104
2           227709.3     3120279H1     1867   2163
2           227709.3     1210943H1     1867   2127
2           227709.3     5877032H1     1869   2133
2           227709.3     5469466H1     1875   2177
2           227709.3     030331H1      1877   2151
2           227709.3     g1970048      1878   2203
2           227709.3     2257012H1     1883   2139
2           227709.3     705952H1      1891   2209
2           227709.3     2564249H1     1891   2202
2           227709.3     5712256H1     1902   2222
2           227709.3     3798289H1     1903   2214
2           227709.3     456952H1      1903   2171
2           227709.3     693406H1      1910   2219
2           227709.3     2403540H1     1915   2217
2           227709.3     6095157H1     1918   2217
2           227709.3     4257073H1     1928   2230
2           227709.3     074494H1      1936   2178
2           227709.3     073044H1      1936   2246
2           227709.3     073608H1      1936   2217
2           227709.3     5882532H1     1937   2217
2           227709.3     073991H1      1936   2227
2           227709.3     073890H1      1936   2119
2           227709.3     073335H1      1936   2162
2           227709.3     5882935H1     1938   2217
2           227709.3     5883716H1     1938   2217
2           227709.3     5881208H1     1939   2217
2           227709.3     5888519H1     1939   2212
2           227709.3     5890218H1     1939   2212
2           227709.3     4783188H1     1938   2225
2           227709.3     2317709H1     1941   2222
2           227709.3     1876721H1     1952   2217
2           227709.3     2469335H1     1954   2223
2           227709.3     734056H1      1953   2076
2           227709.3     2938267H1     1957   2217
2           227709.3     3166669H1     1957   2217
2           227709.3     4591368H1     1966   2227
2           227709.3     2397855H1     1966   2240
2           227709.3     6105204H1     1965   2217
2           227709.3     874849H1      1968   2217
2           227709.3     4458852H1     1967   2217
2           227709.3     874849R1      1968   2621
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TABLE 4 



jn MO- 






OlUl 1 


oiup 


o 






lOAO 
1 vov 


9917 
zz 1 / 


9 




947087916 


lOAA 
1 voo 


9A^/1 
ZQ^IA 


o 








9917 
zz 1 / 


O 




94'^'^v'^78Hl 


900*^ 
zuuo 


91 7*^ 
z 1 / 0 


9 




Z180'^479H 1 
'4ovo*4/ zn 1 


OOO*^ 
zuuo 


9'^ '^7 
ZOO/ 


9 




zovozo*4n 1 


901 A 
ZU 1 0 


9917 
ZZ 1 / 


9 




*4^VJO*-f 0*4 il 1 


901 A 
ZU 1 0 


zoou 


9 


997700 




009 ^^ 

zuzo 


9'^ 00 

zozv 


9 


997700 




009^ 
ZUZO 


900/I 
ZUV^ 


9 


997700 


f^OilOOl Wl 
ou^7 7 1 n I 


90'^A 
zuoo 


9917 
ZZ 1 / 


9 


ZZ/ / o v.o 


o 1 zovzon 1 


00 

zuoo 


9*^1 7 
ZO 1 / 


9 


9977n0 

ZZ/ / \JV 


'^7907ADA 

O / ZV / OKO 


O^'^O 

zoov 


9 AO! 
ZOV 1 


o 

z 


997700 
ZZ / / uv .o 


zooo 1 oon 1 


0R7A 


9 AO A 
ZOVo 


9 


9977O0 
ZZ./ /vjy.o 


91 '^ftO/IM 1 
z 1 oovj*4n 1 


OA'^0 
ZOOZ 


9AA7 
ZOO/ 


9 


997700 
Zz/ /UV.o 


OO^/ ZVZrl 1 


^0 1 


OvO 


9 


007700 *^ 
Zz/ / UV.v3 


OU 1 Ovoon 1 


RI A 


/VO 


9 


9977O0 
ZZ / /UV.o 


^OZ/ 1 Uorl 1 


oz/ 


7A1 
/O 1 


9 


9077nO 
Zz/ /UV.O 


OZVOU^On 1 


OOv 


mi 
0 1 I 


9 


997 700 

ZZ/ /uy.o 


1 ZOOVrl 1 


AO! 
Oz I 


AAQ 


9 

£. 


907700 
ZZ/ /KJV.O 


1 0/ / 1 0 I r 1 


00/ 


IU-40 


9 


007700 7 
ZZ/ /UV.O 


1 "^771 ft! t-4 T 
I O/ / 1 o 1 M 1 


00/ 


AAO 
C50U 


9 


907700 
ZZ / / UV.O 


Qooo V 1 zn 1 


AA'^ 
000 


A'^A 


9 

Z. 


Z^/ / UV.O 


OOOOOOZri 1 


AA*^ 
000 


007 
Vz/ 


9 


zz / /\J7,0 


ooo I zouri 1 


AA/1 


VAo 


9 


9977O0 
zz/ /\JV,0 


0770 70W 1 

0/ / 0/ vn 1 


AA7 
00/ 


O/IA 
V^O 


9 
Z 


097700 ^ 
ZZ/ /UV.O 


9nOA^97l-ll 
zuvooZ/ri 1 


A07 
OV/ 


Oj19 
V*iZ 


9 
z 


ZZ/ / UV.O 


'^^11 OOAWl 


79A 
/Zo 


A7'^ 
0/0 


9 
Z 


9977O0 
zz/ / UV.O 


yiri^tl 097W1 
*tuo 1 vz/ ri 1 


7'^A 
/OO 


wo 


9 
z 


ZZ/ i\jrf»\j 


9il^AAnAW 1 


7'^A 
/OO 


077 
V/ / 


9 
z 


zz / / UV.O 


9if17^'=iO'^TA 

Z*4 / OOVO 1 0 


/ *40 


1 OOZ 


9 
z 


0011 Vf> *^ 
zz / / . o 


9^2lZl'^97Wl 
z**^'40z/ n 1 




V/ 0 


9 
z 


0977OO 

zz/ / UV.O 


^ AAAA/tt-l 1 


77/1 
/ /^ 


lO^A 
lUOO 


9 
z 


zz/ / UV.O 


OOyiSO'^Pl 
yz^ovoK 1 


77A 
/ /o 




9 
z 


zz/ / wV,0 


vz*4\jvon 1 


77A 
/ /o 




9 




9707*S'^7TA 

Z/ U/ v?0/ 1 0 


701 

/V 1 


1 OOZ 


9 

Cm 


997709 


1 849'^9'^TA 


ftO*^ 

OwO 


1 00 1 


9 


zz/ / VJV.O 


1 "^AAyOAM 1 
1 000/ zon 1 


0 i 0 


1 100 
1 1 uv 


9 


/ / wTr . O 


1 *40vJ00/ n 1 


A1 A 


107 A 


2 






ftl A 




9 

€L 




1 o*4zozon 1 


Al A 
0 1 0 


inoo 
1 uuv 


9 
z 


Zz/ /UV.O 


1 A49'^9'^DA 
1 O^ZOZOKO 


Al A 
0 1 0 


1 'kAl 
1 0^/ 


2           227709.3     2757452H1     862    1137
2           227709.3     338917H1      874    1099
2           227709.3     g1382744      934    1319
2           227709.3     g2880866      952    1325
2           227709.3     6289932H1     964    1212
2           227709.3     736987R6      967    1219
2           227709.3     92955000      966    1369
2           227709.3     736987H1      967    1187
2           227709.3     4792654H1     971    1250

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
2           227709.3     270359H1      989    1337
2           227709.3     2532882H1     985    1263
2           227709.3     3246655H1     990    1264
2           227709.3     g2106694      1001   1373
2           227709.3     6210707H1     1069   1397
2           227709.3     340933H1      1086   1262
2           227709.3     4638914H1     1094   1344
2           227709.3     6211794H1     1106   1397
2           227709.3     1220278H1     1148   1408
2           227709.3     4762916H1     1193   1486
2           227709.3     6208765H1     1202   1504
2           227709.3     3167466H1     1208   1497
2           227709.3     4541633H1     1218   1466
2           227709.3     1389793H1     1      242
2           227709.3     1221361H1     146    324
2           227709.3     6210207H1     1217   1524
2           227709.3     6208687H1     1217   1448
2           227709.3     432063H1      149    454
2           227709.3     2197686H1     1241   1389
2           227709.3     3871923H1     142    426
2           227709.3     2473593H1     205    438
2           227709.3     014086H1      1244   1539
2           227709.3     g1809628      1244   1573
2           227709.3     g570725       1274   1510
2           227709.3     013903H1      1285   1535
2           227709.3     3159468H1     1288   1590
2           227709.3     5219130H1     1303   1575
2           227709.3     2473593F6     205    346
2           227709.3     862771H1      1304   1579
2           227709.3     3368510H1     1323   1609
2           227709.3     2473008H1     1340   1591
2           227709.3     3633133H1     1353   1666
2           227709.3     2470872F6     267    453
2           227709.3     2472485H1     1367   1615
2           227709.3     3940725H1     1371   1561
2           227709.3     1839612H1     1376   1668
2           227709.3     1839635H1     1376   1702
2           227709.3     2440859H1     1388   1640
2           227709.3     2560406H1     1389   1677
2           227709.3     g776447       1411   1595
2           227709.3     g892952       1432   1807
2           227709.3     4753851H1     1460   1734
2           227709.3     855862R1      1460   2074
2           227709.3     855862H1      1460   1682
2           227709.3     5436709H1     1470   1708
2           227709.3     3872292H1     1474   1684
2           227709.3     2470872H1     267    518
2           227709.3     4738304H2     363    616
2           227709.3     634613H1      376    613
2           227709.3     4890310H1     1474   1746

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
2           227709.3     964305H1      396    658
2           227709.3     g1303518      1483   2011
2           227709.3     4081986H1     1497   1695
2           227709.3     2731973H1     1501   1745
2           227709.3     964305R1      396    999
2           227709.3     4984432H1     398    674
2           227709.3     4323306H1     407    691
2           227709.3     656683H1      431    693
2           227709.3     656689H1      431    723
2           227709.3     cJ7A%37       1512   1815
2           227709.3     g573087       1512   1840
2           227709.3     3038419H1     1514   1800
2           227709.3     2830914H1     1515   1788
2           227709.3     5206976H2     1530   1806
2           227709.3     073335T6      2039   2650
2           227709.3     g3173537      2054   2217
2           227709.3     862087T1      2058   2649
2           227709.3     862087H1      2058   2344
2           227709.3     g1955564      2058   2421
2           227709.3     2469827T6     2058   2648
2           227709.3     2395978H1     2063   2318
2           227709.3     1833241T6     2074   2648
2           227709.3     12266n9H1     2075   2353
2           227709.3     1436901T6     2075   2641
2           227709.3     2827163H1     2083   2449
2           227709.3     1878980H1     2083   2374
2           227709.3     2473240T6     2083   2643
2           227709.3     2431389H1     2087   2329
2           227709.3     2400824H1     2087   2358
2           227709.3     1879688F6     2095   2530
2           227709.3     1879688H1     2095   2389
2           227709.3     1879688T6     2096   2653
2           227709.3     2195425H1     2098   2400
2           227709.3     g1962618      2102   2694
2           227709.3     2326847H1     2112   2374
2           227709.3     4468886H1     2112   2414
2           227709.3     3940725T6     2118   2654
2           227709.3     2917539H1     2119   2418
2           227709.3     1347063H1     2130   2371
2           227709.3     554568H1      2137   2387
2           227709.3     323431H1      2148   2442
2           227709.3     450205H1      2150   2378
2           227709.3     1448863H1     2150   2423
2           227709.3     2472138T6     2186   2646
2           227709.3     736987T6      2206   2648
2           227709.3     g3932020      2237   2687
2           227709.3     1676401H1     2247   2475
2           227709.3     406441H1      2259   2522
2           227709.3     334657H1      2259   2512
2           227709.3     5888253H1     2261   2496



WO 00/75298

PCT/US00/15344



TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
2           227709.3     3102430H1     2261   2530
2           227709.3     6106355H1     2261   2524
2           227709.3     g3182016      2261   2693
2           227709.3     604792H1      2261   2486
2           227709.3     2428425H1     2261   2433
2           227709.3     5883372H1     2261   2444
2           227709.3     94264131      2262   2702
2           227709.3     633186H1      2262   2544
2           227709.3     3993266H1     2261   2540
2           227709.3     2017401H1     2263   2427
2           227709.3     2424057H1     2268   2536
2           227709.3     404680H1      2273   2496
2           227709.3     3153874H1     2288   2593
2           227709.3     94291477      2288   2694
2           227709.3     4061504H1     2288   2546
2           227709.3     92968506      2289   2694
2           227709.3     3123730H1     2289   2606
2           227709.3     3123955H1     2289   2589
2           227709.3     g4115080      2293   2694
2           227709.3     308633H1      2295   2534
2           227709.3     2397634H1     2297   2495
2           227709.3     94332177      2302   2694
2           227709.3     94267824      2303   2694
2           227709.3     308633F1      2303   2687
2           227709.3     308633R1      2303   2687
2           227709.3     2371681H1     2309   2551
2           227709.3     2421774H1     2309   2547
2           227709.3     91810042      2311   2687
2           227709.3     93958397      2313   2677
2           227709.3     4695290H1     2316   2582
2           227709.3     2756565H1     2320   2630
2           227709.3     9612404       2323   2687
2           227709.3     2445854H1     2327   2587
2           227709.3     449703H1      2332   2458
2           227709.3     2877181H1     2334   2622
2           227709.3     1211271R1     2334   2687
2           227709.3     1211271T1     2334   2649
2           227709.3     1211271H1     2334   2608
2           227709.3     9616409       2341   2660
2           227709.3     2017458H1     2340   2619
2           227709.3     1534375H1     2342   2572
2           227709.3     3145020H1     2344   2683
2           227709.3     1539734H1     2358   2600
2           227709.3     3786174H1     2363   2661
2           227709.3     1454741F1     2367   2687
2           227709.3     1454741H1     2367   2632
2           227709.3     2717951H1     2372   2548
2           227709.3     1359816H1     2374   2618
2           227709.3     1359816F1     2374   2694
2           227709.3     9564646       2385   2694

TABLE 4

SEQ ID NO:  Template ID  Component ID  Start  Stop
2           227709.3     g1154316      2388   2698
2           227709.3     g891561       2393   2707
2           227709.3     2473312H1     2393   2648
2           227709.3     92992779      2394   2643
2           227709.3     1218580T6     2395   2649
2           227709.3     1218580T1     2395   2649
2           227709.3     9824343       2396   2719
2           227709.3     1218580R6     2395   2691
2           227709.3     1218573H1     2395   2641
2           227709.3     286961H1      2403   2690
2           227709.3     4367190H1     2408   2684
2           227709.3     2399122H1     2420   2672
2           227709.3     5882896H1     2423   2687
2           227709.3     9967282       2422   2700
2           227709.3     5883904H1     2423   2687
2           227709.3     5883908H1     2423   2687
2           227709.3     6882375H1     2424   2567
2           227709.3     91191305      2429   2710
2           227709.3     794290H1      2437   2664
2           227709.3     520742H1      2442   2678
2           227709.3     9646174       2443   2687
2           227709.3     1538631H1     2443   2651
2           227709.3     1722285H1     2444   2677
2           227709.3     91202716      2469   2701
2           227709.3     862903T1      2482   2648
2           227709.3     095638H1      2484   2694
2           227709.3     862903R1      2484   2694
2           227709.3     3124846H1     2490   2692
2           227709.3     5906470H1     2497   2691
2           227709.3     2535162H1     2525   2651
2           227709.3     4460277H1     2533   2694
2           227709.3     372978T6      2539   2649
2           227709.3     372978H1      2539   2686
3           237703.2     91963754      1      374
3           237703.2     91137733      95     407
3           237703.2     9843567       124    414
3           237703.2     3070350H1     189    477
3           237703.2     3070350F6     189    709
3           237703.2     1439542H1     413    686
3           237703.2     3203352H1     437    711
3           237703.2     92013304      598    954
3           237703.2     3799002H1     701    1010
3           237703.2     044160T6      958    1453
3           237703.2     824258H1      1020   1254
3           237703.2     91894392      1031   1472
3           237703.2     3491432H1     1052   1315
3           237703.2     2601554F6     1059   1602
3           237703.2     2601554H1     1060   1336
3           237703.2     4617816H1     1073   1337
3           237703.2     4058186H1     1111   1197
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TABLE 4 



ID NO: 


Template ID 


Component ID 


Start 


Stop 


3 


237703.2 


5554248H1 


1146 


1355 


3 


237703.2 


55541 48H1 


1145 


1386 


3 


237703.2 


g3594366 


1199 


1611 


3 


237703.2 


5122547H1 


1278 


1516 


3 


237703.2 


2278690H1 


1379 


1654 


3 


237703.2 


2278690R6 


1379 


1879 


3 


237703.2 


5564683H1 


1402 


1658 


3 


237703.2 


95051 9H1 


1436 


1678 


3 


237703.2 


95051 9R6 


1436 


1723 


3 


237703.2 


gl 123529 


1437 


1805 


3 


237703.2 


5260937H1 


1514 


1744 


3 


237703.2 


3386649H1 


1527 


1744 


3 


237703.2 


5272970H1 


1532 


1780 


3 


237703.2 


3808872H1 


1533 


1838 


3 


237703.2 


9505 19T6 


1554 


2019 


3 


237703.2 


319252H1 


1592 


1986 


3 


237703.2 


4411457H1 


1669 


1947 


3 


237703.2 


g 1933240 


1669 


2147 


3 


237703.2 


2601564T6 


1698 


2313 


3 


237703.2 


824257T6 


1699 


2313 


3 


237703.2 


28591 38T6 


1776 


2312 


3 


237703.2 


1872409F6 


1788 


2160 


3 


237703.2 


1872409H1 


1788 


2062 


3 


237703.2 


1572418H1 


1798 


1997 


3 


237703.2 


1872409T6 


1805 


2312 


3 


237703.2 


530381 HI 


1810 


1963 


3 


237703.2 


5583255H1 


1811 


2075 


3 


237703.2 


126942H1 


1816 


2025 


3 


237703.2 


2278690T6 


1829 


2311 


3 


237703.2 


12n 984Hl 


1832 


2066 


3 


237703.2 


2703527H1 


1832 


2106 


3 


237703.2 


3253906H1 


1885 


2159 


3 


237703.2 


g3934221 


1896 


2349 


3 


237703.2 


1620273H1 


1897 


2117 


3 


237703.2 


g2819399


1908 


2351 


3 


237703.2 


g3895924 


1936 


2349 


3 


237703.2 


g2881190


1962 


2270 


3 


237703.2 


040587H1 


1971 


2158 


3 


237703.2 


g3147053


1973 


2349 


3 


237703.2 


g843522 


2009


2349 


3 


237703.2 


g1844904


2023 


2349 


3 


237703.2 


g2881790 


2024 


2349 


3 


237703.2 


g2820075 


2030 


2349 


3 


237703.2 


g2237723 


2051 


2350 


3 


237703.2 


238512H1


2085 


2313 


3 


237703.2 


292954H1 


2182 


2320 


3 


237703.2 


g2013921


2238 


2522 


3 


237703.2 


g1980268


2386 


2742 


4 


240091.1 


2898155H1


1 


289 


4 


240091.1 


2434264H1 


3 


215 


4


240091.1


5075278H1


3


127


4


240091.1


2647594H1


1720


1770


4


240091.1


4785026H1


1803


2073


4


240091.1


4785001H1


1803


2069


4


240091.1


3600323H1


1258


1557


4


240091.1


2448218F6


1267


1485


4


240091.1


2448218H1


1267


1608


4


240091.1


1391961T6


1267


1730


4


240091.1


083349H1


1309


1464


4


240091.1


070643H1


1309


1543


4


240091.1


3712971T6


1317


1744


4


240091.1


3399693H1


1384


1606


4


240091.1


g3000643


1400


1562


4


240091.1


g3659260


1446


1772


4


240091.1


3491102H1


1447


1559


4


240091.1


2286846H1


1496


1696


4


240091.1


4575130H1


1513


1756


4


240091.1


2584803H1


1539


1770


4


240091.1


2584803F6


1539


1770


4


240091.1


4399541H1


1590


1833


4


240091.1


489490H1


1599


1844


4


240091.1


5541073H1


1605


1804


4


240091.1


797486H1


1609


1772


4


240091.1


2435453H1


3


231


4


240091.1


2434264R6


3


491


4


240091.1


527914H1


4


275


4


240091.1


4382936H1


10


241


4


240091.1


4210908H1


20


292


4


240091.1


3615238H1


47


340


4


240091.1


3615238F6


47


528


4


240091.1


2733107H1


54


275


4


240091.1


494380H1


62


307


4


240091.1


1391961F6


66


475


4


240091.1


1391961H1


66


318


4


240091.1


580112H1


359


558


4


240091.1


1232706F6


389


842


4


240091.1


1232706H1


389


629


4


240091.1


3487133H1


432


698


4


240091.1


g4244249


511


981


4


240091.1


447619H1


580


799


4


240091.1


5782214H1


670


964


4


240091.1


4913646F6


830


1249


4


240091.1


4913546H1


830


1108


4


240091.1


3892111H1


834


1130


4


240091.1


4742244H1


836


1102


4


240091.1


g1484624


867


1316


4


240091.1


2376485F6


901


1205


4


240091.1


2376485H1


901


1124


4


240091.1


2376485T6


902


1167


4


240091.1


1849607H1


907


1198


A 




4UoUyyzn 1 


voz 


1 lyo 


A 

A 


24U091 . 1 


2/64oo6rl i 


9/0 


1201 


A 


z4UUy 1 . 1 


ylAACOOALJ 1 


1 loo 


1401 


A 

4 


z4uuy 1 . I 


Izoz/ Do lo 


1 1 /O 


1729 


A 

4 


z4UUy 1 . 1 


O yt "5 >1 0 A >l TA 


■join 


1 "7 /IZ 

1746 


A 

4 


24Uuy 1 . 1 


ylyioxo 1 QLJ1 
44^43 1 on 1 


1223 


1459 


c 
O 


z4oUyo.o 


QoZ/ow \ V 


lUUO 


1 >1 OO 

1482 


0 


z:4oUyO.O 


gooy/o 1 o 


IOTA 

lU iO 


1 yi"70 

14/9 


O 


O/l'JOOA A 


goyoo /Uo 


1 01 Q 

lU lo 


1479 


c 
O 


24o0yo.o 


g4o72oo2 


1022 


1487 


5


243096.6


4103071H1


1030 


1104 


5 


243096.6 


3470526H1


1032 


1183 


5 


243096.6 


g4457447 


1045 


1481 


5


243096.6


g4222324 


lU/O 


1480 


5


243096.6 


631209R6


1076 


1478 


5 


243096.6 


g3118595


1076 


1478 


5 


243096.6 


g2577165 


1076 


1480 


5 


243096.6 


g2185952


1078 


1493 


5 


243096.6 


g2063697 


1079 


1482 


5 


243096.6 


222052H1 


1079 


1215 


5 


243096.6 


222052F1 


1078 


1479 


5 


243096.6 


222052R1


1078 


1479 


5 


243096.6 


g3737532 


1083 


1509 


5


243096.6 


1849724T6


1095 


1440 


5 


243096.6 


g4110131


1115


1481 


5


243096.6 


631209T6 


1116


1438 


5


243096.6 


2446727T6


1119


1438 


5 


243096.6 


go 1 55321 


1129


1475 


5 


243096.6 


g3178176


1148


1494 


5 


243096.6 


948177H1


1149


1428 


5 


243096.6 


948177R1


1149


1488 


n 
O 


24oUy6.o 


T ri /I O 1 ~7ZLJ 1 

1942 1 76H1 


1 1 A O 

1 163 


1441 


O 


O/l QOOA A 

z4ouyo.o 


1 0>101 "7Ar)A 

1 y4z: 1 /Ol<6 


1 1 AQ 

1 loo 


1418 


o 


0/1 Qr\OA A 

z4oL)yo.o 


1 942 1 OoH 1 


1 1 AO 

1 loo 


1 A Ar\ 
1440 


C> 


z4ouyo.o 


OOC54 1 OOM 1 


1 1 Ai4 

\ i04 


l4oo 


0 


z4ouyo.o 


1 A 1 QPil ATA 

{ 4 1 OO 1 4 1 0 


T ono 
J 2uy 


1430 


o 




1/11 OP^I >1LJ 1 

1 4 i 00 1 4n 1 


1 Ol A 

I 2 lo 


l4o 1 


c 
D 




1 4 1 oo04n 1 


1 OT A 
1 Z JO 


l4oo 


c 
D 


A4ouyo.o 


}4 1 OD l4rO 


IOTA 

I 2 lo 


14/y 


c 
O 


z4ouyo.o 


oo2o43n 1 


1 2 to 


1458 


O 


zAouyo.o 


g** rU/ / 1 1 


1 ZzU 




5 


243096.6 


g4457962


1227 


1483 


5 


243096.6 


g2837785


1228 


1479 


5 


243096.6 


g819991


1238 


1496 


5 


243096.6 


g564440


1237 


1488 


5 


243096.6 


g816379


1251 


1540 


5 


243096.6 


g885380


1252 


1488 


5 


243096.6 


g768804


1261 


1481 


5 


243096.6 


6093263H1 


1263 


1492 


5 


243096.6 


g645318


1286 


1488 


5 


243096.6 


g566867 


1292 


1526 


5 


243096.6 


g816311


1302 


1679 


5 


243096.6 


g671079


1296 


1488 


5 


243096.6 


g2219072 


1297 


1472 


5 


243096.6 


g2539665 


1300 


1479 


5 


243096.6 


g670466


1300 


1526 


5 


243096.6 


2474482H1 


1313 


1542 


5 


243096.6 


g4328047 


1314 


1487 


5 


243096.6 


g2205935 


1335 


1454 


5 


243096.6 


g832021 


1367 


1678 


5 


243096.6 


g2205936 


1375 


1472 


5 


243096.6 


g2789365


1393 


1479 


5 


243096.6 


g873007 


1418 


1540 


5 


243096.6 


g9CXXD98 


1419 


1488 


5 


243096.6 


g567639 


1424 


1667 


5 


243096.6 


1918362H1 


1473 


1731 


5 


243096.6 


4727611H1


1576 


1854 


5 


243096.6 


g822245 


1648 


1976 


5 


243096.6 


g812869


1651 


2012 


5 


243096.6 


g830918 


1654 


2012 


5 


243096.6 


1414612H1 


1662 


1890 


5 


243096.6 


4761250H1 


1662 


1939 


5 


243096.6 


g678372 


1665 


1972 


5 


243096.6 


g561207


1665 


1955 


5 


243096.6 


g2002379


1665 


2002 


5 


243096.6 


g709471 


1665 


1866 


5 


243096.6 


4761242H1 


1664 


1939 


5 


243096.6 


g518391


1678 


1933 


5 


243096.6 


4595686H1 


1707 


1981 


5 


243096.6 


1511493H1 


1792 


1996 


5 


243096.6 


1511493F6


1792 


2244 


5 


243096.6 


1512376H1 


1792 


2009 


5 


243096.6 


g2003356


1922 


2087 


5 


243096.6 


4697285H1 


2203 


2446 


5 


243096.6 


4941432H1 


2381 


2673 


5 


243096.6 


1230891H1


2413 


2508 


5 


243096.6 


1522037H1 


2421 


2625 


5 


243096.6 


3749969H1 


2502 


2799 


5 


243096.6 


2125142H1 


2527 


2795 


5 


243096.6 


2125142F6 


2527 


2841 


5 


243096.6 


121562H1 


2620 


2806 


5 


243096.6 


856668H1 


2745 


2933 


5 


243096.6 


5882521H1


2758 


3030 


5 


243096.6 


5888582H1 


2759 


2976 


5 


243096.6 


5882569H1 


2760 


3030 


5 


243096.6 


g775350


2793 


3137 


5


243096.6


g705857


2790 


3138 


5 


243096.6 


g2002380


2803 


3138 


5 


243096.6 


5927949H1 


2845 


3140 


5 


243096.6 


1511493T6 


2883 


3600 



WO 00/75298


PCT/US00/15344


5 


243096.6 


1335311H1


2951 


3205 


5 


243096.6 


1613745H1 


2965 


3179 


5 


243096.6 


3472823H1 


3005 


3245 


5 


243096.6


g570224 


3048 


3318 


5 


243096.6 


g4095588 


3134 


3545 


5 


243096.6 


g831152 


3143 


3366 


5 


243096.6 


g4286632 


3205 


3476 


5 


243096.6 


6907720H1 


3208 


3501 


5 


243096.6 


g4187457


3286 


3557 


5 


243096.6 


g3842315 


3295 


3465 


5 


243096.6 


g4005713 


3303 


3466 


5 


243096.6


g4006389


3305 


3465 


5 


243096.6 


g4006377 


3305 


3559 


5 


243096.6 


g4006150


3305 


3537 


5 


243096.6


g4006070 


3315 


3542 


5 


243096.6 


g4187003


3315 


3554 


5


243096.6


g4188554


3315 


3637 


5 


243096.6 


g4006771 


3315 


3537 


5 


243096.6 


g4072007 


3316 


3542 


5 


243096.6


g4017934 


3316 


3537 


5 


243096.6 


g4150328


3316 


3465 


5 


243096.6


g4005644 


3316 


3537 


5 


243096.6 


6840086H1 


3345 


3653 


5 


243096.6 


5289394H1 


3472 


3737 


5 


243096.6 


g710217 


3508 


3789 


5


243096.6 


g694296 


3619 


3781 


5 


243096.6


g2206232 


3623 


3794 


5 


243096.6 


g2206104 


3656 


3795 


5 


243096.6


2897215H1


1 


249 


5


243096.6


3541808H1 


181 


397 


5 


243096.6


2352032H1 


32 


249 


5 


243096.6 


2446727F6 


44 


104 


5 


243096.6


3123367H1 


44 


356 


5


243096.6 


4385825H1 


181 


379 


5 


243096.6 


2446727H1 


44 


308 


5 


243096.6 


2905666H1 


45 


326 


5 


243096.6 


2767616H1


46 


308 


5 


243096.6 


4521271H1 


214 


473 


5 


243096.6 


1726750H1 


47 


209 


5 


243096.6 


g1965606


235 


621 


5 


243096.6 


3117919H1


47 


328 


5 


243096.6 


2762827H1 


49 


309 


5


243096.6 


5395762H1 


245 


510 


5 


243096.6 


5585677H1 


251 


484 


5 


243096.6 


3416289H1 


253 


507 


5


243096.6 


5407275H1 


257 


511 


5 


243096.6 


5407149H1 


257 


520 


5 


243096.6 


3452689H1 


49 


240 


5 


243096.6 


4819033H1 


292 


515 


5 


243096.6 


483458H1 


50 


302 


5 


243096.6 


1941753H1 


291 


637 


5 


243096.6 


486741H1


50 


296 


5 


243096.6 


g766717


307 


480 


5 


243096.6 


2215901H1


51 


147 


5 


243096.6 


2990593H1 


73 


383 


5 


243096.6 


4017755H1 


76 


378 


5 


243096.6 


2558416H1


81 


363 


5 


243096.6 


870507R1 


82 


682 


5 


243096.6 


870507H1 


82 


339 


5 


243096.6 


3692401H1


82 


285 


5 


243096.6 


2387055H1 


83 


336 


5 


243096.6 


1641267H1 


329 


555 


5 


243096.6 


2483682H1 


84 


331 


5 


243096.6 


4977750H1 


89 


382 


5 


243096.6 


g4244257 


343 


817 


5 


243096.6 


3165660H1 


89 


373 


5 


243096.6 


4043243H1 


89 


406 


5 


243096.6 


3500940H1 


347 


625 


5 


243096.6 


3580985H1 


91 


415 


5 


243096.6 


1518953F6 


93 


405 


5 


243096.6 


g672832 


92 


414 


5 


243096.6 


g574622 


92 


418 


5 


243096.6 


2206923H1 


355 


624 


5 


243096.6 


2681788H1 


92 


287 


5 


243096.6 


g672843 


92 


444 


5 


243096.6 


790680R1 


365 


938 


5 


243096.6 


790680H1 


365 


584 


5 


243096.6 


3500449H1 


386 


701 


5 


243096.6 


1518953H1 


92 


280 


5 


243096.6 


3510335H1 


92 


396 


5 


243096.6 


44G1B67H1 


393 


653 


5 


243096.6 


1624276H1 


395 


583 


5 


243096.6 


3099469H1 


92 


415 


5


243096.6 


g873107 


93 


484 


5 


243096.6 


g874944 


93 


492 


5 


243096.6 


2201243H1 


411 


667 


5 


243096.6 


4907323H2 


97 


377 


5 


243096.6 


2215590H1 


421 


665 


5 


243096.6 


1647105H1 


102 


323 


5 


243096.6 


3337242H1 


105 


332 


5 


243096.6 


3328567H1 


421 


709 


5 


243096.6 


5165830H1 


113 


391 


5 


243096.6 


1919378R6 


432 


865 


5 


243096.6 


2078775H1 


114 


391 


5 


243096.6 


1919378H1 


432 


700 


5 


243096.6 


1798353H1 


115 


371 


5 


243096.6 


5109893H1 


447 


675 


5 


243096.6 


3581083H1 


116 


378 


5 


243096.6 


2202470H1 


456 


711 


5 


243096.6 


1642210H1 


461 


676 


5 


243096.6


2828817H1


122 


399 


5 


243096.6


1642206H1 


461 


676 


5 


243096.6 


g669309 


136 


449 


5 


243096.6


4932616H1


463 


618 


5


243096.6


g571269


136 


492 


5 


243096.6


1753890H1 


464 


690 


5 


243096.6


756217H1


464 


706 


5 


243096.6


3030389H1 


464 


764 


5 


243096.6




136 


462 


5 


243096.6


1764027H1 


464 


704 


5 


243096.6


3893562H1 


139 


449 


5 


243096.6


g885379


139 


482 


5 


243096.6


026083H1 


509 


692 


5


243096.6


9768492H1


143 


413 


5 


243096.6


836506R1 


513 


1092 


5




5204950H1


147 


395 


5 


243096.6


836506H1 


513 


759 


5 






517 


685 






173031H1


157 


390 


5




5197820H2


161 


419 


5






522 


940 


5 


243096.6




161 


395 


5






572 


854 




243096.6




160 


502 


5 


243096.6


2828115T6


574 


950 


5


243096.6


g2816446


612 


874 




243096.6


4536339H1 


624 


877 


5 


243096.6


2670757H1


629 


868 


5


243096.6


4994228H1 


697 


1004 


5


243096.6


793596H1


697 


946 


5


243096.6


g2058963


697 


941 


5 


243096.6


1560602H1 


719 


948 


5 


243096.6


1535660H1 


719 


898 


5 


243096.6


g2058866


730 


936 


5 


243096.6


2316449H1 


764 


1045 


5 


243096.6


686656H1 


771 


1028 


5 


243096.6


3728525T1 


783 


1432 


5 


243096.6 


3669224H2 


789 


1073 


5 


243096.6


639411H1


798 


1050 


5 


243096.6


5332267H1


820 


1063 


5 


243096.6


2084254H1 


837 


1136 


5 


243096.6 


2652025T6 


859 


1424 


5 


243096.6 


961879T6 


859 


1437 


5 


243096.6 


5088178T6


858 


1465 


5 


243096.6 


1849724F6 


860 


1441 


5 


243096.6 


1849724H1 


860 


1135 


5 


243096.6 


5395762T1 


884 


1440 


5 


243096.6 


1919378T6 


880 


1452 


5 


243096.6 


2663785H1 


891 


1144 


5 


243096.6 


3625012H1 


904 


1051 


5 


243096.6 


2683022H1 


931 


1212 


5 


243096.6 


3499439H1 


933 


1219 


5 


243096.6 


4005888H1 


936 


1211 


5 


243096.6 


1682469T7 


939 


1485 


5 


243096.6 


2351530H1 


941 


1130 


5 


243096.6 


g2063947 


951 


1207 


5 


243096.6 


1668866H1 


960 


1201 


5 


243096.6 


1667633H1 


960 


1220 


5 


243096.6 


g2358923 


963 


1060 


5 


243096.6 


3085780H1 


971 


1082 


5 


243096.6 


g814507 


161 


443 


5 


243096.6 


g830479 


161 


470 


5 


243096.6 


g816410 


162 


519 


6 


244366.6 


1889554H1 


493 


750 


6 


244366.6 


1889554F6


493 


939 


6 


244366.6 


4298324H1 


1 


255 


6 


244366.6 


853003H1 


12 


225 


6 


244366.6 


g2178494


517 


912 


6 


244366.6 


853003R6 


26 


483 


6 


244366.6 


3327565H1 


555 


787 


6 


244366.6 


2263295H1 


278 


532 


6 


244366.6 


5401026H1 


604 


816 


6 


244366.6 


1285225H1 


606 


862 


6 


244366.6 


2674162H1


661 


904 


6 


244366.6 


3101288H1 


295 


585 


6 


244366.6 


3295139H1


815 


1056 


6 


244366.6 


6002940H1 


886 


1170 


6 


244366.6 


3101288F6 


295 


694 


6 


244366.6 


6002740H1 


904 


1170 


6 


244366.6 


3246058F6 


941 


1282 


6 


244366.6 


3246058H1 


941 


1192 


6 


244366.6 


3887233H1 


959 


1240 


6 


244366.6 


2431320H1 


972 


1192 


6 


244366.6 


1513444H1 


978 


1189 


6 


244366.6 


2813740H1 


1071 


1363 


6 


244366.6 


2815664H1 


1071 


1274 


6 


244366.6 


2813707H1 


1071 


1359 


6 


244366.6 


3492628H1 


1138 


1414 


6 


244366.6 


2183893H1 


1190 


1460 


6 


244366.6


5641164H1


1238 


1485 


6 


244366.6 


5a0082Hl 


1239 


1487 


6 


244366.6 


3165135H1 


1254 


1487 


6 


244366.6 


3075416H1 


1282 


1565 


6 


244366.6 


3559024H1 


1351 


1639 


6 


244366.6 


3451987H1 


1525 


1785 


6 


244366.6 


4378692H1 


1597 


1817 


6 


244366.6 


g2162961


1742 


2237 


6 


244366.6 


3890528H1 


1764 


1919 


6 


244366.6 


5017346H1 


3006 


3272 


6 


244366.6 


1690531H1


2938 


3105 


A 
\J 






3007 


31 33 


A 

\J 




4<S14'^'^'^H1 






A 

w 






9979 


393Q 


A 
LI 






3012 


3999 


A 

t? 








'^934 

OZOM 


A 










A 
U 


9zlzl'^AA A 




3031 

www 1 




A 
O 


OAA'^AA A 






oozo 


A 

O 


OAA'K^^ A 








A 
U 




Uw/O / z. / n 1 




^AC\C\ 
o*+uu 


A 

O 


OAA^f^f^ A 








A 
O 


OAA'\f^f^ A 








A 
O 


OAA'^/^A A 
ZA*IOOO,0 


vjwo 1 oon 1 


0U*40 


ooou 


A 


OAA'^^A A 




0U*4O 


'\AO*^ 
0*470 


O 




uuo / LJ 1 n 1 


OUmO 


'^/171 
0*4 / 1 


O 




UUOO I Orl 1 


0U*40 


O^Zt 


A 
O 




UUO 1 oon 1 


OL/'+O 


'^>1A3 
OmOO 


A 
O 




uuo 1 z./ri 1 


0U'40 


OAOo 


O 


O/l/lQAA A 


uuo-^oort 1 


ou^o 


0400 


O 


Oyi yiOAA A 




0U*40 


ooou 


O 


O/l/l^lAA A 
Z^^oOO.O 


coon A 1 7W 1 


qnno 

oUUZ 


oouu 


A 
O 


OAA'\A/\ A 


uuoo^ 1 n 1 


0U*40 


OUM'T 


A 
O 




UUozV**r1 1 


?n/i'^ 


OA 1 1 


A 
O 


O/ivioAA A 


nn A/1 9 w 1 
uuoo^zn 1 


ou^o 


0*4UU 


A 
O 


OAA'\A\A A 


nn*^ A/1 AW 1 


ou**o 




A 
O 


OAA'\/<A A 


uuoooun 1 


0\J*40 


oovz 


A 
O 




R^ '^ftm 9W1 
o I ooQ 1 zn 1 


OU/ 7 


OO 


A 
O 




no/iAn'^i-i 1 
uy^ouon i 


OUO 1 


ozuo 


A 
O 




i / ^oouzn 1 


Oil** 


OOOo 


A 
O 


OAA'\AA A 


^^0'\A1A1A 
\ OZO'^ / ^4 1 0 


O 1 zu 




A 
O 


OAA'\^f\ A 


zoo 1 ooun 1 


O 1 zz 


3 3 AO 

OOOU 


A 
o 


OAA'\fsS A 


7AA97'^1-n 
/ooz/ v^n 1 


O 1 OU 




A 


O/IZ'^AA A 

^*4*40 wvJ . LJ 








A 




'^7n^7^9Hl 


31 AT 


3533 


A 




^7A4*=iA^Hl 


31 7*^ 


3471 


A 




'^im 9AftTA 


3197 


3715 


A 
O 


9443AA A 


55A13a9Hl 


3201 


3503 


A 




(y7A7'^AW 
u/ *4/ o**ri 1 


3909 


3449 


A 
O 




U/ OOZzii 1 


^^909 

OjCU^ 


3403 


A 
O 




1 07^/1-4 ITA 


^^999 




A 
O 






39A4 


3in A 


6 


244366.6 


2112529T6


3250 


3723 


6 


244366.6 


g3933445 


3274 


3766 


6 


244366.6 


4861989H1 


3274 


3567 


6 


244366.6 


1737024F6 


3292 


3734 


6 


244366.6


g2584374


3285 


3757 


6 


244366.6 


1735490H1 


3292 


3561 


6 


244366.6 


1737024H1 


3292 


3554 


6 


244366.6 


393035H1 


3303 


3590 


6 


244366.6 


2158031F6


3307 


3758 


6 


244366.6 


g4438605 


3308 


3758 


6 


244366.6 


2654055H1 


3325 


3624 


6 


244366.6 


g2934389


3326 


3756 


6 


244366.6 


g4391414 


3341 


3756 


6 


244366.6 


4460645H1 


3349 


3601 


6 


244366.6 


g2208607


3359 


3756 


6 


244366.6 


g2318342


3366 


3756 


6 


244366.6 


g1062585


3370 


3743 


6 


244366.6


g4081771


3372 


3761 


6 


244366.6 


g1925212


3381 


3757 


6 


244366.6 


4726758H1 


3390 


3662 


6 


244366.6 


g2410378


3392 


3763 


6 


244366.6 


867419H1


3398 


3679 


6 


244366.6 


g616070


3402 


3764 


6 


244366.6 


g2163418


3405 


3756 


6 


244366.6 


g2555602


3413 


3760 


6 


244366.6 


g661365


3428 


3766 


6 


244366.6 


1889554T6


3433 


3725 


6 


244366.6 


g2336915


3444 


3766 


6 


244366.6 


g616115


3445 


3756 


6 


244366.6 


g4435130


3469 


3766 


6 


244366.6 


g4525507


3602 


3757 


6 


244366.6 


2158031H1 


3523 


3766 


6 


244366.6 


g2401624


3528 


3765 


6 


244366.6 


g4268526


3559 


3766 


6 


244366.6 


2009370H1 


3666 


3756 


6 


244366.6 


218073H1 


3682 


3756 


6 


244366.6 


2350763H1 


3699 


3760 


6 


244366.6 


1647483H1 


2520 


2769 


6 


244366.6 


2432662H1 


2526 


2757 


6 


244366.6 


901941R1


2538 


3096 


6 


244366.6 


901941H1


2538 


2895 


6 


244366.6 


901981H1


2538 


2858 


6 


244366.6 


2052263H1 


2545 


2857 


6 


244366.6 


3483573H1 


2553 


2851 


6 


244366.6 


3565807H1 


2558 


2822 


6 


244366.6 


g2278841


2560 


2917 


6 


244366.6 


g2178439


2563 


2917 


6 


244366.6 


g2153824


2565 


2917 


6 


244366.6 


g1329145


2568 


2874 


6 


244366.6 


2198515H1


2647 


2908 


6 


244366.6 


2200581H1


2647 


2725 


6 


244366.6 


g1548506


2680 


3207 


6 


244366.6 


2321185H1


2694 


2917 


6 


244366.6 


3244071T6


2568 


2798 


6 


244366.6 


2936492H1 


2700 


2917 


6 


244366.6 


600642H1 


2586 


2891 


6 


244366.6 


g1124072


2711 


2850 


6 


244366.6 


g1833465


2712 


2856 


6 


244366.6 


g4327019


2736 


2861 


6 


244366.6 


3165778H1 


2588 


2925 


6 


244366.6 


g2139164


2594 


2835 


6 


244366.6 


633565H1 


2753 


2917 


6 


244366.6 


1620269F6 


2766 


3147 


6 


244366.6 


1520026H1 


2766 


2917 


6 


244366.6 


1520269H1 


2766 


2917 


6 


244366.6 


g4095144


2611 


2917 


6 


244366.6 


1237708H1 


2767 


3016 


6 


244366.6 


2768411H1


2770 


3013 


6 


244366.6 


g1321416


2615 


2858 


6 


244366.6 


2416809H1 


2800 


2917 


6 


244366.6 


1570846H1 


2811 


3018 


6 


244366.6 


3898114H1


2621 


2859 


6 


244366.6 


874422H1 


2824 


3131 


6 


244366.6 


g1349372


2825 


2952 


6 


244366.6 


2815235H1 


2842 


3116 


6 


244366.6 


4897079H1 


2925 


3201 


6 


244366.6 


1231478H1 


2632 


2861 


6 


244366.6 


2152887H1 


2927 


3042 


6 


244366.6 


3510024H1 


2635 


2917 


6 


244366.6 


1975441F6


2938 


3306 


6 


244366.6 


2189606H1 


2938 


3203 


6 


244366.6 


1975441H1


2938 


3088 


6 


244366.6 


3470739H1 


2938 


3153 


6 


244366.6 


g1844965


2643 


2917 


6 


244366.6 


1623474H1 


2155 


2382 


6 


244366.6 


1338343H1 


1799 


2055 


6 


244366.6 


2022520H1 


2157 


2423 


6 


244366.6 


805131H1


2200 


2397 


6 


244366.6 


795024H1 


2202 


2393 


6 


244366.6 


1338343F6 


1799 


2241 


6 


244366.6 


1297158H1 


1810 


2050 


6 


244366.6 


3354386H1 


2223 


2491 


6 


244366.6 


2540345H1 


2224 


2461 


6 


244366.6 


2313137H1 


1887 


2152 


6 


244366.6 


2805024H1 


2233 


2536 


6 


244366.6 


2454581T6 


2281 


2827 


6 


244366.6 


g570404


1961 


2245 


6 


244366.6 


2773281H1


2282 


2528 


6 


244366.6 


3246058T6 


2284 


2807 


6 


244366.6 


3332425T6 


2286 


2817 


6 


244366.6 


3321733H1 


1988 


2109 


6 


244366.6 


g2153937


2324 


2754 


6 


244366.6 


1464642H1 


1995 


2224 


6 


244366.6 


g1319564


2331 


2938 


6 


244366.6 


3555988H1 


2005 


2304 


6 


244366.6 


3384030H1 


2043 


2317 


6 


244366.6 


g1898463


2332 


2760 


6 


244366.6 


g1062443


2337 


2746 


6 


244366.6 


3188982H1 


2348 


2686 


6 


244366.6 


2255759H1 


2045 


2318 


6 


244366.6 


2112529H1


2355 


2621 


6 


244366.6 


3693731H1


2364 


2662 


6 


244366.6 


1864803H1 


2069 


2351 


6 


244366.6 


853003T6 


2385 


2815 


6 


244366.6 


g1925211


2137 


2615 


6 


244366.6 


2641982H1 


2385 


2598 


6 


244366.6 


2516658H1 


2397 


2535 


6 


244366.6 


1864803T6 


2414 


2898 


6 


244366.6 


1338343T6 


2421 


2814 


6 


244366.6 


1398258H1 


2440 


2681 


6 


244366.6 


1829874H1 


2471 


2738 


6 


244366.6 


589137H1


2478 


2685 


6 


244366.6 


4855638H1 


2147 


2411 


6 


244366.6 


1698592H1 


2500 


2726 


6 


244366.6 


3524562H1 


2154 


2348 


6 


244366.6 


g1319504


2523 


2952 


6 


244366.6 


1623474F6 


2155 


2682 


7 


405313.4 


4640462H1 


573 


837 


7 


405313.4 


g1774849


595 


979 


7 


405313.4 


4721077H1 


54 


194 


7 


405313.4 


5944975H1 


61 


370 


7 


405313.4 


1948647H1 


596 


828 


7 


405313.4 


1592016H1


86 


282 


7 


405313.4 


1948647R6 


596 


1136 


7 


405313.4 


g4070751


686 


1137 


7 


405313.4 


2384959H1 


86 


263 


7 


405313.4 


4571373H1 


715 


978 


7 


405313.4 


g954058 


893 


1203 


7 


405313.4 


1559555H1 


903 


1120 


7 


405313.4 


1559555F6 


903 


1363 


7 


405313.4 


g617633


914 


1316 


7 


405313.4 


1302977H1 


86 


257 


7 


405313.4 


4215272H1 


919 


1195 


7 


405313.4 


g1492868


88 


230 


7 


405313.4 


4643815H1


962 


1212 


7 


405313.4 


965308H1 


968 


1255 


7 


405313.4 


965308R1 


968 


1622 


7 


405313.4 


1321926T6 


132 


483 


7 


405313.4 


5136028H1 


993 


1266 


7 


405313.4 


g4264253 


155 


609 


7 


405313.4 


g3739298 


156 


610 


7


405313.4 


4306178H1 


1008 


1207 


7 


405313.4 


g1237752


1009 


1175 


7 


405313.4 


4551446H1 


1047 


1310 


7 


405313.4 


g4522654 


210 


522 


7 


405313.4 


1628853H1 


1049 


1219 


7 


405313.4 


1627193H1 


1049 


1261 


7 


405313.4 


1316291H1 


349 


522 


7 


405313.4 


1628853F6 


1049 


1649 


7 


405313.4 


1283759H1 


1053 


1330 


7 


405313.4 


4312209H1 


1067 


1367 


7 


405313.4 


4103086H1 


1115 


1239 


7 


405313.4 


g1984348


1147 


1472 


7 


405313.4 


g2540589 


1168 


1616 


7 


405313.4 


g1492809


369 


527 


7 


405313.4 


3584223H1 


1177 


1354 


7 


405313.4 


102569H1 


458 


607 


7 


405313.4


4829437H1 


540 


737 


7 


405313.4 


3674811H1


1186 


1495 


7 


405313.4 


1733325H1 


1192 


1412 


7 


405313.4 


g1984560


561 


804 


7 


405313.4 


g2010449 


2002 


2335 


7 


405313.4 


2682711H1


2014 


2309 


7 


405313.4 


g4535531 


2016 


2337 


7 


405313.4 


g1773873


2022 


2340 


7 


405313.4 


g4137010 


2025 


2346 


7 


405313.4 


4111488H1 


2026 


2288 


7 


405313.4 


2153069H1 


2034 


2315 


7 


405313.4 


870657R1 


2041 


2613 


7 


405313.4 


870657H1 


2041 


2250 


7 


405313.4 


5433876H1 


2063 


2317 


7 


405313.4 


g3162264


2073 


2338 


7 


405313.4 


g3057393 


2080 


2337 


7 


405313.4 


g3872586 


2081 


2335 


7 


405313.4 


2081907T6 


2084 


2288 


7 


405313.4 


659926H1 


2097 


2337 


7 


405313.4 


056422H1 


2097 


2317 


7 


405313.4 


3486469H1 


2105 


2337 


7 


405313.4 


817086R1


2127 


2337 


7 


405313.4 


817086H1 


2127 


2388 


7 


405313.4 


817086T1 


2127 


2280 


7 


405313.4 


g3597649 


2142 


2335 


7 


405313.4 


3988965H1 


2205 


2499 


7 


405313.4 


2664989H1 


1633 


1850 


7 


405313.4 


3962192H1


1646 


1776 


7 


405313.4 


2290557H1 


1646 


1921 


7 


405313.4 


4115475H1 


1647 


1861 


7 


405313.4 


g670108 


1649 


1960 


7 


405313.4 


g570685 


1650 


1964 


7 


405313.4 


2213032H1 


1653 


1919 


7 


405313.4 


1667188H1 


1653 


1775 


7 


405313.4 


2285586H1 


1653 


1841 


7 


405313.4 


990792H1 


1669 


1968 


7 


405313.4


1283707T6 


1682 


2306 


7 


405313.4 


3781966H1 


1686 


2020 


7 


405313.4


2402093H1 


1700 


1946 


7 


405313.4 


3666248H1 


1707 


1866 


7 


405313.4 


1948647T6 


1711 


2289 


7 


405313.4 


5681549H1 


1746 


2012 


7 


405313.4 


2885455T6 


1752 


2292 


7 


405313.4 


1301039H1 


1790 


2095 


7 


405313.4 


1669180T6 


1799 


2303 


7 


405313.4 


1669180H1 


1808 


2038 


7 


405313.4 


3436329H1 


1815 


1956 


7 


405313.4 


g1624740


1828 


2218 


7 


405313.4


1569555T6 


1831 


2302 


7 


405313.4 


3106278H1 


1846 


2121 


7 


405313.4 


2323323T6 


1848 


2308 


7 


405313.4 


1914641H1 


1848 


2062 


7 


405313.4 


1628853T6 


1854 


2300 


7 


405313.4 


g2946487 


1855 


2335 


7 


405313.4 


2081907F6 


1854 


2313 


7 


405313.4 


2081917H1 


1854 


2015 


7 


405313.4


g672957


1869 


2199 


7 


405313.4 


2916179H1


1878 


2177 


7 


405313.4 


3556793T6 


1881 


2306 


7 


405313.4 


g2783325


1885 


2345 


7 


405313.4 


1268187F1 


1888 


2344 


7 


405313.4 


1268187H1


1888 


2152 


7 


405313.4 


1268187F6 


1888 


2203 


7 


405313.4 


1268187T6 


1890 


2317 


7 


405313.4 


g616527


1892 


2244 


7 


405313.4 


g3593850


1896 


2335 


7 


405313.4 


g573001


1897 


2264 


7 


405313.4 


g815353


1898 


2253 


7 


405313.4 


g4083770 


1902 


2335 


7 


405313.4 


g4281927


1912 


2337 


7 


405313.4 


g2753877 


1930 


2345 


7 


405313.4 


2199211H1


1930 


2188 


7 


405313.4 


2375908H1 


1934 


2181 


7 


405313.4 


g3797979 


1954 


2337 


7 


405313.4 


2663312H1


1963 


2221 


7 


405313.4 


g4265408


1967 


2342 


7 


405313.4 


g3919084


1969 


2335 


7 


405313.4 


g4085496


1970 


2334 


7 


405313.4 


g668546 


1976 


2156 


7 


405313.4 


2149388H1 


1985 


2274 


7 


405313.4 


2601556H1 


1995 


2280 


7 


405313.4 


g2789326


2898 


3179 


7 


405313.4 


g646138


2913 


3179 


7 


405313.4 


g888680


2916 


3206 


7 


405313.4 


g646137


2924 


3179 


7 


405313.4 


g645108


2927 


3179 


7 


405313.4 


g917579


2933 


3178 


7 


405313.4 


g3051580


2943 


3179 


7 


405313.4 


g3764150


2943 


3179 


7 


405313.4 


92903435 


2947 


3179 


7 


405313.4 


91225232 


2947 


3179 


7 


405313.4 


91087854 


2947 


3111 
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ID NO: 


Template ID 


Component ID 


Start 


Stop 


7 


405313.4 


gl647814 


2998 


3213 


7 


405313.4 


g917686 


3058 


3210 


7 


405313.4 


1 226453T1 


3087 


3168 


7 


405313.4 


1226453H1 


3087 


3167 


7 


405313.4 


g31 74481 


3089 


3178 


7 


405313.4 


g646728 


2878 


3179 


7 


405313.4 


g884856 


2884 


3211 


7 


405313.4 


g815354 


2895 


3216 


7 


405313.4 


2714769H1 


2231 


2345 


7 


405313.4 


1440914H1 


2252 


2421 


7 


405313.4 


1 44091 4F6 


2252 


2675 


7 


405313.4 


4311940H1 


2263 


2546 


7 


405313.4 


6073069H1 


2266 


2498 


7 


405313.4 


549605H1 


2285 


2554 


7 


405313.4 


2157285H1 


2287 


2523 


7 


405313.4 


23231 79H1 


2324 


2460 


7 


405313.4 


2323108H1 


2324 


2581 


7 


405313.4 


g2 197270 


2337 


2696 


7 


405313.4 


2294826H1 


2444 


2617 


7 


405313.4 


g 1043992 


2444 


2683 


7 


405313.4 


38601 IQHl 


2444 


2682 


7 


405313.4 


3246952H1 


2472 


2725 


7 


405313.4 


1956178H1 


2480 


2773 


7 


405313.4 


4312886H1 


2495 


2794 


7 


405313.4 


g770052 


2623 


2930 


7 


405313.4 


1415784H1 


2660 


2922 


7 


405313.4 


g884865 


2666 


3025 


7 


405313.4 


2014171H1 


2668 


2943 


7 


405313.4 


g888679 


2667 


3021 


7 


405313,4 


2224791 HI 


2686 


2955 


7 


405313.4 


4106915H1 


2695 


2996 


7 


405313.4 


92217789 


2772 


3181 


7 


405313.4 


g2874275 


2772 


3179 


7 


405313.4 


1 44091 4R1 


2773 


3179 


7 


405313.4 


g4Q75424 


2775 


3179 


7 


405313.4 


g4328896 


2776 


3179 


7 


405313.4 


287748H1 


2778 


3143 


7 


405313.4 


3496950H1 


2802 


3087 


7 


405313.4 


2750652H1 


2806 


3105 


7 


405313.4 


g765774 


2820 


3182 


7 


405313.4 


93895056 


2831 


3179 


7 


405313.4 


g4450984 


2833 


3179 


7 


405313.4 


92099917 


2853 


3348 


7 


405313,4 


92018248 


2860 


2990 


7 


405313.4 


9564562 


2869 


3179 


7 


405313,4 


g2459191 


2869 


3206 


7 


405313.4 


91099005 


2521 


2798 


7 


405313.4 


2731146H1 


2559 


2794 


7 


405313.4 


gl 198836 


2578 


2840 


7 


405313.4 


2229784H1 


2603 


2852 
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TABLE 4 



ID NO: 


Template ID 


Component ID 


Start 


Stop 


7           405313.4     2229548H1     2603   2858
7           405313.4     1321926H1     1      235
7           405313.4     1321926F6     1      388
7           405313.4     1337104H1     8      265
7           405313.4     2682113H1     8      281
7           405313.4     g2754255      1191   1654
7           405313.4     4641436H1     1206   1470
7           405313.4     g2558152      1209   1673
7           405313.4     g2540638      1218   1616
7           405313.4     1282G95H1     1225   1484
7           405313.4     1283707H1     1225   1511
7           405313.4     1282048H1     1225   1506
7           405313.4     1283707F6     1226   1659
7           405313.4     3280293H1     1234   1511
7           405313.4     5398765H1     1241   1391
7           405313.4     5072115H1     1293   1556
7           405313.4     5219645H1     1319   1443
7           405313.4     1893001H1     1317   1616
7           405313.4     3696826H1     1319   1614
7           405313.4     5920192H1     1319   1647
7           405313.4     g3770(XI3     1322   1609
7           405313.4     5665259H1     1327   1565
7           405313.4     3151495H1     1326   1625
7           405313.4     3357256H2     1345   1500
7           405313.4     692886H1      1353   1604
7           405313.4     5400503H1     1388   1623
7           405313.4     2323323H1     1395   1657
7           405313.4     2323323R6     1395   1895
7           405313.4     4622312H1     1404   1718
7           405313.4     5373496H1     1438   1700
7           405313.4     624120H1      1456   1728
7           405313.4     2535460H1     1469   1750
7           405313.4     2115405H1     1505   1801
7           405313.4     g954059       1516   1748
7           405313.4     2292561H1     1538   1790
7           405313.4     2556287H1     1580   1842
7           405313.4     g866975       1610   1953
7           405313.4     861911R1      1619   2200
7           405313.4     861911H1      1619   1876
7           405313.4     g873285       1626   2006
8           436857.2     232864F1      1467   1959
8           436857.2     2704880T6     1506   1925
8           436857.2     g4373224      1522   1968
8           436857.2     g4112872      1522   1954
8           436857.2     g4390509      1556   1962
8           436857.2     5858881H1     1624   1888
8           436857.2     5267222H1     1663   1883
8           436857.2     1477850T6     1680   2225
8           436857.2     4617960T6     1761   2215
8           436857.2     g917696       1901   2223
8           436857.2     3270104H1     1906   2161
8           436857.2     5782044H1     2003   2209
8           436857.2     g4268407      2010   2240
8           436857.2     g3051680      2063   2241
8           436857.2     481468H1      2063   2250
8           436857.2     5683178H1     1      212
8           436857.2     1477850H1     56     278
8           436857.2     1477850F6     56     488
8           436857.2     4619212H1     138    290
8           436857.2     g1978924      207    556
8           436857.2     3758509H1     206    498
8           436857.2     4617960H1     323    550
8           436857.2     4718961H1     323    555
8           436857.2     4617960F6     323    761
8           436857.2     1992924H1     366    651
8           436857.2     g4269060      528    627
8           436857.2     4255690H1     599    830
8           436857.2     4761770H1     727    1004
8           436857.2     4613106H1     796    1034
8           436857.2     g2000739      795    1025
8           436857.2     2704880H1     883    1176
8           436857.2     2704880F6     883    1313
8           436857.2     2707669H1     961    1264
8           436857.2     805609H1      1054   1283
8           436857.2     805609R1      1054   1630
8           436857.2     4135963H1     1113   1415
8           436857.2     4294249H1     1152   1398
8           436857.2     5450335H1     1156   1421
8           436857.2     5373082H1     1203   1419
8           436857.2     4190554H1     1247   1412
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CLAIMS 

What is claimed is: 

1. An isolated polynucleotide comprising a polynucleotide sequence selected from the group
consisting of:

a) a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14,

b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-14,

c) a polynucleotide sequence complementary to a),

d) a polynucleotide sequence complementary to b), and

e) an RNA equivalent of a) through d).

2. An isolated polynucleotide of claim 1, comprising a polynucleotide sequence selected
from the group consisting of SEQ ID NO: 1-14.

3. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a
polynucleotide of claim 1.

4. A composition for the detection of expression of disease detection and treatment molecule
polynucleotides comprising at least one of the polynucleotides of claim 1 and a detectable label.

5. A method for detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 1, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction
amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

6. A method for detecting a target polynucleotide in a sample, said target polynucleotide
comprising a sequence of a polynucleotide of claim 1, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

7. A method of claim 5, wherein the probe comprises at least 30 contiguous nucleotides. 

8. A method of claim 5, wherein the probe comprises at least 60 contiguous nucleotides.

9. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 1.

10. A cell transformed with a recombinant polynucleotide of claim 9.

11. A transgenic organism comprising a recombinant polynucleotide of claim 9.

12. A method for producing a disease detection and treatment molecule polypeptide, the
method comprising:

a) culturing a cell under conditions suitable for expression of the disease detection and
treatment molecule polypeptide, wherein said cell is transformed with a recombinant polynucleotide
of claim 9, and

b) recovering the disease detection and treatment molecule polypeptide so expressed.

13. A purified disease detection and treatment molecule polypeptide (MDDT) encoded by at
least one of the polynucleotides of claim 2.

14. An isolated antibody which specifically binds to a disease detection and treatment
molecule polypeptide of claim 13.

15. A method of identifying a test compound which specifically binds to the disease
detection and treatment molecule polypeptide of claim 13, the method comprising the steps of:

a) providing a test compound;

b) combining the disease detection and treatment molecule polypeptide with the test
compound for a sufficient time and under suitable conditions for binding; and

c) detecting binding of the disease detection and treatment molecule polypeptide to the
test compound, thereby identifying the test compound which specifically binds the disease detection
and treatment molecule polypeptide.



WO 00/75298 PCT/US00/15344



16. A microarray wherein at least one element of the microarray is a polynucleotide of claim 3.

17. A method for generating a transcript image of a sample which contains polynucleotides,
the method comprising the steps of:

a) labeling the polynucleotides of the sample,

b) contacting the elements of the microarray of claim 16 with the labeled polynucleotides of
the sample under conditions suitable for the formation of a hybridization complex, and

c) quantifying the expression of the polynucleotides in the sample.

18. A method for screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of claim 1,
the method comprising:

a) exposing a sample comprising the target polynucleotide to a compound, and

b) detecting altered expression of the target polynucleotide.

19. A method of claim 6 for toxicity testing of a compound, further comprising

(c) comparing the presence, absence or amount of said target polynucleotide in a first
biological sample and a second biological sample, wherein said first biological sample has been
contacted with said compound, and said second sample is a control, whereby a change in presence,
absence or amount of said target polynucleotide in said first sample, as compared with said second
sample, is indicative of toxic response to said compound.
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SEQUENCE LISTING 



<110> INCYTE GENOMICS, INC.
Hodgson, David M. 
Lincoln, Stephen E. 
Russo, Frank D. 
Spiro, Peter A. 
Banville, Steven C. 
Bratcher, Shawn R. 
Dufour, Gerard E. 
Cohen, Howard J. 
Rosen, Bruce H. 
Chalup, Michael S. 
Hillman, Jennifer L.
Jones, Anissa L. 
Yu, Jimmy Y. 
Greenawalt, Lila B. 
Panzer, Scott 
Roseberry, Ann M. 
Wright, Rachel J. 
Daniels, Susan E.



<120> MOLECULES FOR DISEASE DETECTION AND TREATMENT



<130> PT-1042 PCT 



<140> To Be Assigned 



<141> Herewith 

<150> US 60/137,412; 60/147,500; 60/147,501; 60/147,542 
<151> 1999-06-03; 1999-08-05; 1999-08-05; 1999-08-05 



<160> 14 

<170> PERL Program

<210> 1 

<211> 3101 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 222197.6 

<220> 

<221> unsure 

<222> 3077, 3084, 3093, 3097-3098 
<223> a, t, c, g, or other 



<400> 1 

agtgattgca tgagcttagg gaggggagtg acatgatctg atttacgtct gtggaagacc 60
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actctgggtg 
aggactggtc 
ccttaagatc 
gggtgcgtca 
cctctcctgg 
gtggctgacc 
tggcttctgg 
gacttctggt 
tcatcccacc 
aaagaataca 
tgctgctgta 
aaaatggatc 
tttgtgctct 
tttcagttca 
ataactgtaa 
gcagttatgt 
ttgaaaagtg 
tttggggggc 
ctgcccacga 
actgaaactt 
catctgtgac 
ttttatatat 
caacggaaag 
ggcttctgtg 
tgctgtttcc 
catttcaggg 
ggcgtttaca 
caaaagatct 
agagacatcc 
cgtgtgtgca 
gtgtacagca 
gcggggtcct 
gaacagccgt 
tgagtgcttg 
tataaatggc 
cccgatgctg 
gggcacctct 
gccaccgcag 
tttggttagc 
ggtgcggact 
aacttgcctg 
tgagggtctt 
aattgttttg 
catccgaatc 
accgtcctag 
aagcgtgtgc 
tcctggagac 
ttggacatgc 
aacaattatc 
cttgactttt 
cagttctcaa 



ctgcatgggg 
agcaggtgat 
ctgaagaaac 
gggaaatcat 
ctgaaaatga 
gggtctggtt 
tcgcctatgc 
actctgtggt 
tgagaaccat 
tggagagctt 
ttaaacccga 
atcactgccc 
tcactatgta 
tctcctgtgt 
tcctgttgat 
ttggcaccca 
agaagcccac 
ccccctcact 
gacccagaaa 
gctcacagac 
caacagggca 
ttatagtcac 
gtgtgtggcc 
gagaatactt 
aatcatgaag 
ggctcctgct 
tagaaagacg 
gtgcactgaa 
tttgaccctc 
tgtgtgtcaa 
aacaagctat 
gcccgtggtt 
tcctgtgcgg 
agggccttgg 
acctaggtaa 
tgtgtggggc 
gggaagggga 
gcagcgctcc 
ttttacgttt 
ttaaaattat 
gtttgtacat 
cctttatgct 
cctttctgtt 
agggcttcta 
ccaagcgagc 
gccgttigggg 
gccagtttcc 
gtgtgggctt 
caagtggttg 
caacctcttc 
aaaaaanaaa 



gactggactg 
ctaagcactc 
ggcacaaaat 
gcagccatca 
caactatgac 
catccgtgac 
agacttcgtg 
caacggggtc 
gctcaccgac 
gcagctgaag 
gcgcgcccac 
gtgggtgaac 
tatagctctg 
ccgagggcag 
cttcctgtgc 
aatccactcc 
atgggagcgg 
cctctggatg 
aggtggcccg 
ttccagttat: 
actggaacct 
agatggcaga 
acacgaagaa 
cgggttatta 
aaaaacagtg 
gaccccgcca 
ttttggtctc 
cagtgaaggt: 
tcagcaagtc 
aattgccagt 
tttttagaaa 
actatgaatg 
cccttcgttg 
aactgatttt 
gagcagagct 
aggggaggca 
ggggaccatg 
agtccgggaa 
tcttctccac 
gccagaaagc 
tttttgccgg 
tgccctccac 
catctgtgaa 
ccactgctga 
aaacctgcag 
gaagagctgc 
gagattgttc 
cagtgtgagg 
aatcctgtga 
tttcaatgta 
gggnggccgc 



ttgggtgagc 
cccagatccg 
gttcaagtga 
ggacacaggc 
tcttcatcgt 
ggctgcggca 
gtgactttcg 
atctttaac t 
cctggggcag 
cccggggaag 
cactgcagta 
aattgtgtag 
tcttcagtcc 
tggactgaat 
cttgagggtc 
atatgcaacg 
aggctgcgat 
aatccctttg 
gagtt-ctcag 
ttatttgggg 
acacaaacca 
ggaagaggct 
gccaaacgcc 
catgggttat 
aatccagtga 
ctcagcagtg 
gattagctcc 
ggcttccggc 
tgtgtgtgtg 
gttgtttagg 
ccgacgtttc 
tattgctgt-t 
ccctcctgct 
tttttttttg 
gcggctcggt 
tccttactgg 
aggcagccag 
tggccaggat 
ccacggcaca 
caacagctcc 
acgcatcaag 
actaagagaa 
ctgttttttg 
tgcaaaacca 
ggggtttgga 
gtcacagcca 
tgcatattca 
cttttaatat 
gacttggcaa 
acttttatat 
cgnctannga 



agaggcatga 
atcacatagg 
tgtttagaaa 
tccgggacgt 
cctcctcctc 
tgatctgtgc 
tcatgctgct 
gcttggccgt 
tacccaaagg 
tcatctacaa 
tttgcaaaag 
gagaaaagaa 
atgctctgat 
gcagtgattt 
ttctgttttt 
acgagacgga 
gggaagggat 

tgggcttccg 

tgtgaggcgt 
tctgaaggat 
attgcttgca 
ctcagtcccc 
gtggcctcct 
tcaaatcctg 
acagggattc 
cactccccgg 
gatgctttgc 
acactccccg 
cgtgtctgtg 
caatgtaaca 
agggaagagg 
ggaggacatc 
ttcatttttt 
ttccagccaa 
gacttgatac 
agaggcaggg 
cccctggcag 
ggcgccctct 
ggtgataaaa 
cctcgtgggg 
aagcaatctg 
gttggcgtct 
tttttaatta 
caaagggacc 
agtggacttg 
gagggacaaa 
tttgcacatt 
gtatatcctg 
gtgtgtgcaa 
gaaataaagt 



gagtagagag 
acagtatgca 
taacttgtga 
cgagcatcat 
cgaggctgac 
tgtcatgacg 
gccttccaaa 
gcttgccctg 
aaacgctacg 
gtgccccaag 
atgtattcgg 
tcaaagattt 
cctttgtgga 
ttcacctccg 
cactttcact 
gatcgagcga 
gaagtccgtc 
atttaggcga 
ggctcatcag 
atcaacagct 
gcaagcagag 
acctgtacaa 
gcagagctgg 
ggtcctgagc 
tccaagcagt 
atcacagcag 
gctgaagttg 
ctgccccgga 
cgtgtgcgcg 
tttaccggct 
ggagagagcc 
tcgatccaaa 
aaagaaatct 
attagcagtg 
ttggggcagc 
cccagccatt 
gggcgactgt 
tgttggagtt 
taggatcctt 
ccttgcctta 
tgacaaagtc 
ccctcctggg 
ctctgtaccc 
tacctgagcc 
gtcaccgcag 
gtgtgggtga 
gttgtctggg 
ttatcaataa 
atcaagtata 
aatcaattaa 



120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3101 



<210> 2 

<211> 2561 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 227709.3 
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<220> 

<221> unsure 
<222> 126, 2144 
<223> a, t, c, g. 



or other 



<400> 2 

gcgggcgcgt cgccctctgc ccccgccggc accctggcca tgacaggcaa gtcggtgaag 60
gacgtggatc ggtaccaggc tgtcctggcc aacctgctgc tggaggagga taacaagttt 120
tgtgcngatt gccagtctaa agggccgcga tgggcctctt ggaacattgg tgtgttcatc 180
tgcattcgat gtgctggaat ccacaggaat ctgggggtgc acatatccag ggtaaagtca 240
gttaacctcg accagtggac tcaagaacag attcagtgca tgcaagagat gggaaatgga 300
aaggcaaacc gactttatga agcctatctt cctgagacct ttcggcgacc tcagatagac 360
ccagctgttg aaggatttat tcgagacaaa tatgagaaga agaaatacat ggaccgaagt 420
ctggacatca atgcctttag gaaagaaaaa gatgacaagt ggaaaagagg gagcgaacca 480
gttccagaaa aaaaattgga acctgttgtt tttgagaagg tgaaaatgcc acagaaaaaa 540
gaagacccac agctacctcg gaaaagctcc ccgaaatcca cagcgcctgt catggatttg 600
ttgggccttg atgctcctgt ggcctgctcc attgcaaata gtaagaccag caatacccta 660
gagaaggatt tagatctgtt ggcctctgtt ccatcccctt cttcttcggg ttccagaaag 720
gttgtaggtt ccatgccaac tgcagggagt gccggctctg ttcctgaaaa tctgaacctg 780
tttccggagc cagggagcaa atcagaagaa ataggcaaga aacagctctc taaagactcc 840
attctttcac tgtatggatc ccagacgcct caaatgccta ctcaagcaat gttcatggct 900
cccgctcaga tggcacatcc cacagcctac cccagcttcc ccggggttac acctcctaac 960
agcataatgg ggagcatgat gcctccacca gtaggcatgg ttgctcagcc aggagcttct 1020
gggatggttg cccccatggc catgcctgca ggctatatgg gtggcatgca ggcatcaatg 1080
atgggtgtgc cgaatggaat gatgaccacc cagcaggctg gctacatggc aggcatggca 1140
gctatgcccc agactgtgta tggggtccag ccagctcagc agctgcaatg gaaccttact 1200
cagatgaccc agcagatggc tgggatgaac ttctatggag ccaatggcat gatgaactat 1260
ggacagtcaa tgagtggcgg aaatggacag gcagcaaatc agactctcag tcctcagatg 1320
tggaaataaa aacaaaacac ctgtatggct gccattctct tcagccctgc gctctcccct 1380
ttccacagcc tccacccctg acccccatcc tcttttccta cctctctgtt tggtttagaa 1440
attgctcaat aagtcatttg gggtttggca tcctgcccag ccacttccca aacatgaaga 1500
cctctctgtt gctttatgtt gtacatgccc catagccatc ccaacgtcct ccccagtcct 1560
ctcctggcac cagcacctta gaagttgttg gcagaaggca cttaaactgt gggagaagtg 1620
tgcacacctt tgagtccctt ccctcaaggt taaagctcct gtcagactct cagaagggtc 1680
tgtgggtgtt gtatattagg caaacagggg aaagcttaga ggtccttcta tatgtgttaa 1740
taagctgttt ctaagtgttt aaatttgaaa agcatcatgt tctcatgatt tatgggaatg 1800
aagcaagtac tgaaatcaaa ttaaatactc cctgggtcct gggtcagttt gaccctagcc 1860
ctggggtgag gcaagccccc tcctatgagg atgagcaaaa atactactct cttcgccctg 1920
agttgctttc tggatctggg gcttcaggac ttgctgcttc agtcagcctt tattagcacc 1980
aaagacttta tgaagatccc acacacagac acacatccct tcccgcctcc cccctgcctt 2040
cagtaggatc tggctccgtg gctggaggac caacccctat agtgggaatg cagagcttaa 2100
cgtgtactgc ttgtgtgtgt gcgtgaagtg tgtgtgtgtg taanaagtgt gtgttccgcc 2160
tcccaccctc tccccatctg ctctgggtat ttttgttttt gtttagtttt aggtttacaa 2220
cagagaggaa ttaatttatc agcagcctaa aactgttgtg tttttcttat ggtttaaaaa 2280
acgccatgtc attgataact ccctttctcc cttcccttct cccggtctgc tgatcactct 2340
ttcatgcctg tgtatccagg gtgctctgtt tccccaccgt tcccaggtgt acgaggcaga 2400
gggccgggac agctttcctc tcagtcattg ttcaccccac ttgaaaattc agacaagaaa 2460
actttgctta aaagatttca tgtgtgggaa ccacagttcc tggctgcctt tctcctgtgt 2520
atgtgtaaat tccttaataa atattgcagg gaaggactgt t 2561



<210> 3 

<211> 2710 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 237703.2

<220> 

<221> unsure 






<222> 712, 799, 2332, 2334, 2342, 2470, 2611, 2682
<223> a, t, c, g, or other 

<400> 3 

caggtacctt attacattat tttatttaat catcatgtat ccctataagt aatattcttc 60
tcttttatag agtaagaaat tgagattcag gaatattaat ttgcccagga tcatcgaagt 120 
ggaatgaaca tcaaaagcct attccctctg cttggccact tccacctcat tttactaagt 180 
ttccccatgt ctgtgttagt aaactaataa ctaaaagggt ctcgcatttt aaatagcttt 240 
ttaacccaag agcatgccac atttaaccag aggcccatag aacaaactga aaattacaac 300 
ctaaaaggtt gtttctaagg ttgtattgag aaggaattga gctcttgaat ccctagaatt 360 
ccttattaat actttattct tctgttaaaa gttttatttt taaaagtttc atacagtgtg 420
tatattggtg tgataatcct acagaaaaat caagcagtta tgttttcttc acagataaca 480 
cataaaatat taaacagaaa gcctatgtta ttcattggac tgaagctttt atgcaataaa 540 
ccttagttgg accaggagta aatgtatggt ttgatattca gagaatctca ttcttagaag 600 
caacaaagtg tagttaacac taacttgttc attcttaaat cagtagtcct ctcctcccca 660 
aaaagagatc ttaaattatt ttcatttaaa gtcatctact aacaagtaag tntttattca 720 
acttaattaa atctaacacc acaagacaat tttgttttag ttattgtttt ggtttgagtt 780 
gagttgaaag atttctttnt tttcttctca gcttaccaca gtgcggagac tgctttctga 840 
aaaggccact cacgtgaaca ctagggatga agatgagtat acccctcttc atcgagcagc 900 
ctacagtgga cacttagata ttgttcagga gctcattgca cagggggccg atgttcatgc 960
agtgactgtg gatggctgga cgcccctgca cagtgcttgt aagtggaata atcccagagt 1020
ggcttctttc ttactgcagc atgatgcaga tatcaatgcc caaacaaaag gcctcttgac 1080
ccccttgcat cttgctgctg ggaacagaga cagcaaggat accctagaac tcctcctgat 1140 
gaaccgttac gtcaaaccag ggctgaaaaa caacttggaa gaaactgcat ttgatattgc 1200 
caggaggaca agtatctatc actacctctt tgaaattgtg gaaggctgta caaattcttc 1260 
acctcagtct taacaattct agtaattttc ctaagtttct aaataccagt gcctcctgtg 1320 
tgtgagatgt attcccataa tcaaagttga cgtcaaacat cttactacaa aaattcagtg 1380
acattcatta taacattctt ccaagtgaat tgcctgactt tgatgtcaaa atgtatttga 1440 
aagtaatttg catatatctt taattatttc tgtggagttt gtgatttttt tatcagaaat 1500
aattttaatg tgtgtatact taaaaacttg acacgggttg tacagaaact ggtatttttg 1560
gtgctgatac aagagaaatg tatttttaaa tatcccacat cctggatctt tgttgggtat 1620
ttagtatatt gacatatatt tttataaggt gaggtaactc agaacttaat ttaaaagtct 1680
taaatattct gatacaattc agctgtcttc tctaccttac catagccagt tgctttcatt 1740 
ttaaaccaga gcaagtaaca tattagtgac ttgaatcttc ataagttaaa gtaaaaaaca 1800 
gcaaaaaacc tagatctttg tcttttagaa cacagaccat tttcaggaaa gcagttagct 1860 
aagtgtttaa ttcatgaata ttgtatactg catcccctac cacaatttac acaatcctgt 1920
ggatagtcct acctcaccct ggtcaaccta catgatcctt aagctaatgg cgaatcacga 1980 
tgaccttgta gacatgcaca caactatacc tttgtccaac agatcataat atatctgcta 2040 
tccaactggt tttacctgcc taatcctact gatttgggca ctgcttgtat agtctctcaa 2100 
gttcacagga aatgttgatt ttctaaggtc ctcattttta cagagtatac aggcaaagtg 2160 
acaggggaaa aggaattagt ctaagagtaa ggggatgatt attatattga ggctaaaacc 2220
acaaagtggc tcaggcttta aaaaaaaaac actgtggata atgacaaaaa gcataagtaa 2280 
aaatatttga gaaaaataaa gtacaagttt tgaacaacac aaaaggcatg antncatttt 2340 
tnacctgtgt atgtctttct tggatccaga acattattca tccagcacgc acttagttat 2400 
ttaacatcta ctcactcagt ctctccagca gcaatttttg cattgtctat ctagcccctt 2460 
tgtgattgtn cccaaagttt tgtcttctca acaccacaac actccagggg aagggaacta 2520
aaccagttgc tctttacttc agttaaattt ttaagatgtc caccaaggct tatctctttc 2580
aagccatcct acgtaaccca gtcaccctag nctaagtaat aatgttattt aatcaaaggt 2640
taaatattta tttttgctta gaacttatta gatcatctca gnaaaagtca gaggtaatat 2700
ttgggcctgg 2710 

<210> 4 

<211> 2059 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 240091.1 

<220> 
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<221> unsure 
<222> 1850 

<223> a, t, c, g, or other 



<400> 4 

cgcgctccgg acctggcagg cggcggctgc agggcaggtc caggggccac atggctgagg 60
gggacgcagg gagcgaccag aggcagaatg aggaaattga agcaatggca gccatttatg 120
gcgaggagtg gtgtgtcatt gatgactgtg ccaaaatatt ttgtattaga attagcgacg 180
atatagatga ccccaaatgg acactttgct tgcaggtgat gctgccgaat gaatacccag 240
gtacagctcc acctatctac cagttgaatg ctccttggct taaagggcaa gaacgtgcgg 300
atttatcaaa tagccttgag gaaatatata ttcagaatat cggtgaaagt attctttacc 360
tgtgggtgga gaaaataaga gatgttctta tacaaaaatc tcagatgaca gaaccaggcc 420
cagatgtaaa gaagaaaact gaagaggaag atgttgaatg tgaagatgat ctcattttag 480
catgtcagcc ggaaagttcg gttaaagcat tggattttga tatcagtgaa actcggacag 540
aagtagaagt agaagaatta cctccgattg atcatggcat tcctattaca gaccgaagaa 600
gtacttttca ggcacacttg gctccagtgg tttgtcccaa acaggtgaaa atggttcttt 660
ccaaattgta tgagaataag aaaatagcta gtgccaccca caacatctat gcctacagaa 720
tatattgtga ggataaacag accttcttac aggattgtga ggatgatggg gaaacagcag 780
ctggtgggcg tcttcttcat ctcatggaga ttttgaatgt gaagaatgtc atggtggtag 840
tatcacgctg gtatggaggg attctgctag gaccagatcg ctttaaacat atcaacaact 900
gtgccagaaa catactagtg gaaaagaact acacaaattc acctgaggag tcatctaagg 960
ctttgggaaa gaacaaaaaa gtaagaaaag acaagaagag gaacgaacat taatacctga 1020
aactatagga aaggttaatt tgcctataat tatatataca ttccatagtc atcaaggaat 1080
atattgtgca gagagagtat ccttgactgc ttaagtcagc cagttcagca tggataccaa 1140
cattagcttt tcttcttggt tatatcatct gccaaaaata gagaacttat gatctattca 1200
tgtgtgtttc aggcttattt gggagaacta atttgaacct aatcaccact tcatctaatt 1260
ttagcaaggt aacagttgcc cagggcagta cctgaattaa ctgtccattt cagtacatgt 1320
caagtgcctt tgttaggtgg agaagaaatg tctctagagg aatataaata cctgatttct 1380
tgtcatcgag attcttgtac tgttaaatga atattgcctt ttactgctct ttatggctta 1440
ttggaatagg agctcattta agattgatct tggagagttt cttcttgtga ttttagttca 1500
taagtatgtc acctttcatt ttatagtgtt catcattgag taatggatta agtgaaaatc 1560
caggagtatc catctgcagt tatgtgctga ggtgataatt catccaacat atttgttagc 1620
ataaatatta tgcttcagtt tctgttgcaa attggtgatt gtgaaattac agaaagtgat 1680
tttctagtct gctttttttg tttaattctt gtaatgtaag caataaatat ggagtgtcag 1740
tagtctcctt ccaccccaga aatgtgttgg tgtaacattc tcgtttcttt taacaacctg 1800
ggaagtacct ttcttgtgat cttcactgag gaattagaac tatgatagan gttaggctgt 1860
ggcaaatggg acattcgtag agtgggatag aggtggcaga atgaacctgg tgtagggcag 1920
gagtatgttg tgtagtacat caatttgatg catgctttcc atctgcactc cagacggctt 1980
tctcagttcc aagattttgc agagagaagg agcaaacctt ttcattggaa aaacagaaac 2040
aaccctcccc cccattttt 2059



<210> 5 
<211> 3705 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 243096.6

<220> 

<221> unsure 

<222> 13-14, 2121 

<223> a, t, c, g, or other 



<400> 5 

cgcagtgccg gannccgcag cgccggaacc tcagaggcgg gtcgcagcgg cgcagaggag 60
gtcagctgcg ggagcgtttc cggggacggt gccgccatga gattgacccc gcgcgcgctg 120
tgcagcgccg cccaggccgc ctggcgggag aacttccccc tgtgcggtcg cgacgtggcg 180
cgctggttcc cgggccacat ggccaagggg ctgaagaaga tgcagagcag cctgaagctg 240
gtggactgta tcatcgaggt ccacgatgcc cggatcccac tttcaggccg caaccctctg 300
tttcaggaaa cccttgggct taagcctcac ttgctggtcc tcaacaagat ggacttggcg 360
gatcttacag agcagcagaa aattatgcaa cacttagaag gagaaggcct aaaaaatgtc 420
atttttacca actgtgtaaa ggatgaaaat gtcaagcaga tcatcccgat ggtcactgaa 480
ctgattggga gaagccaccg ctaccaccga aaagagaacc tggagtactg tatcatggtc 540
attggggtcc ccaacgtggg caagtcctcc ctcatcaact ccctccggag gcagcacctc 600
aggaaaggga aagccaccag ggtgggtggc gagcctggga tcaccagagc tgtgatgtcc 660
aaaattcagg tctctgagcg gcccctgatg ttcctgttgg acactcctgg cgtgctggct 720
cctcggattg aaagtgtgga gacaggcctg aagctggccc tgtgtggaac ggtgctggac 780
cacctggtcg gggaggagac catggctgac tacctgctgt acaccctcaa caaacaccag 840
cgctttgggt acgtgcagca ctacggcctg ggcagtgcct gtgacaacgt agagcgcgtg 900
ctgaagagtg tggctgtgaa gctggggaag acgcagaagg tgaaggtgct cacgggcacg 960
ggtaacgtga acgttattca gcctaactat cctgcggcag cccgtgactt cctgcagact 1020
ttccgccgtg ggctgctggg ttccgtgatg ctggacctcg acgtcctgcg gggccacccc 1080
ccggctgaga ctttgccctg aacttgtccg ggtagggagg gccggaggca tgtggcctcc 1140
cagacctcct gacctgggtg gttgaggctc aagacagctc acccggtcca gaagctccat 1200
gctggtcact agggtgctgt gctctctggc gccccacagc ctggccagct ccagggaccc 1260
cagttgcagg gcccaagcag gtgggagtgg acaccaggct tcccagtgga cgtccctgag 1320
cagctccgca tgcttggttc tcccggagct tcctgctcag gcctcttgag aaatggatgc 1380
tgtctcagaa ggagttaaag ctataacctg taacctttaa aatctccagt taaagggcct 1440
gtttcttact ggcctgtgag gtgcaccgta gtgccttggg cctgtgtgtt aaagctgctc 1500
tcaccagtgg aacctaagaa atgagcaggt tggcagctag ggtttgtgtt ggaggctttc 1560
ggtccagtgt cttgcagtcc tacaacaagt gagaggcttg ctgccatcag agaggtttat 1620
ttcacactta caggcacaca cagacacaga ccagagactc ccagcagcag agcccaagca 1680
ctggcttcgc ccctcagtgc cctggggcat gttcagggca gggttgaggg ggacgccctg 1740
cacatggctt tgctgtgcaa tgactggaag gccgcccggc atgggcagta gagacccctg 1800
gcctctgagc accttctagc tcactggtag tgggattctg cattagtggg gctgagagat 1860
gtgggggccc ctccagcccc attatagtgc acctgaaggg gtccacagcc tgtgtcctag 1920
aagagggaag aggaaggaag gtgggtgggg ctggtagtat ggactaaggt gctgcaggac 1980
ctggggccag ggacatcctg tgcagaagct ccggctgcct ctttgcggtg gtggcctgac 2040
cgtcccacag cagcctccac cagggccctg gtgctcagtg gcccctcttt gctggctggc 2100
tgcctctgct gccccatacc ncacacactc atcagcctga agttagcccc tgagtgccac 2160
ctgcatcgtg ccataaccct gaccgcctgg ggcaggaagt attcaggttg gctgtgtcag 2220
atgctaatgt gctgaatcaa cagtcattgc agatcacgaa gtgtccatca taactggaac 2280
attccatcag cttgcagtgc tgtggtgtgt gagggtctgg tgcagctcag cccattttcc 2340
aggtgggcat ctgcaaagtt gagggggctc cggtgggtct ctctgctgtg aggagactca 2400
gaccaccccc tgcctcctgg gggaaatgtc agaagggctt ctctgcctat gaggatctgg 2460
ggcagggctt ggccttggcc tgctggtctt ggaggcgttg agcttggtct ggaaggggtg 2520
gaggagcgtc tgggctcact gggccagggg cattgctggc agtgtggacg ggaggctgca 2580
gggcgctgcc tcctgtggct tagtgccctg gagctagaga gcagtgcttg gttgagtcct 2640
gccaacagct tccagatcct cacccaggcc agaacccagg ccagctgggg aaggcagagg 2700
ctggcagggc ccgtggtggg tgctggtctt gactttggtg tccactgagt cccgaggctc 2760
aggcccagga gggatgcagt ccggctgagg gcgaggctgt caccaggaca tggagagggt 2820
gagatcccaa ggccacgggg gggggggcag ggagaacccc tcctaccctg gatgagtggg 2880
tgactggaga gctagagaac gtggcagacc caagacctct cagtgctgag cccatggagg 2940
atgccccagg ctggcgggac tgggaagcag agggctggtc ttaacacagg tgtgtccagt 3000
gctggaggca agtccttgtc gtgactgtcc agcgccactc catgtctctc ctgtccttgg 3060
atgttggggg gctcagcctc ttgcatgggt gtcctgctgg gcgctgggcc ccgccactgg 3120
ccccctgctt gctttggggt ctgagttagc tcctggctcc actgagcagg ccgtcagctg 3180
ccagcccacc acgcggatac ccaggccctg ttccgaggcc tggaacagct gcttccgaag 3240
aaggggctgc cttcagggaa atgcgtgcac cgtgcagcct gtgctgtgcc cagggaggcc 3300
tcttcagcgg gattggcagt tgctgtgccc tgagaacagg cagaactgtg tgatccctga 3360
atgtgaacct gaagttcaaa ggacttggaa agctctggaa tgtgttggtt tttccccccc 3420
aaaatgggtc ctaa^gaggg taaagtgact tgtttcaagt tgttggagca aagtgggtct 3480
ctcacggatc tcggcctgag ggtgtggggg agaaggcctg gacagcccct cagggcaggg 3540
tgtgttttcc caccagccgc agagagccag gatggacgtt cctcggacgg acggttttcc 3600
tgcttgggaa tgttcctggg ctgtgagatc cactcttctg ggcaggtggt tagcacctaa 3660
cgtttttccc tcacttcccc qcaaattctt aagtcctttg gtcca 3705

<210> 6
<211> 3644
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 244366.6

<400> 6

cttttccacc tgcaaccatt tgcatgtgta
tgtacaagtt gtgtttctta atcttccctt 
catttggaat cacctttccc cctcccatgt
tggctttatt tgggaggggg aagggtgata 
actcctctct gttgcttccc tcttcccatt 
gggctatgcg gcagggcaga tttcccatca 
gagattcaaa cccagcaagt atgtcccggt 
tacgacactc ttctttgcct ttacgtgtcc 
gcccatctac aatgcaatta tgtttctctt 
catggaccca gggattttcc ctcgagctga 
agctcccctt tacaaaacag tggagataaa 
cacctgccgc ttttaccgtc cccctcgatg 
ggaggaattt gatcatcact. gcccctgggt 
ttattttttc cttttcctcc tttccctgac 
cctcctttat gtcctctacc acatagagga 
ggcagtaatg tgtgtggctg gcttattctt 
cgtggttctg gtggccaggg gacgcacaac 
aggtgtgaac cccttcacca atggctgctg 
tccagcaccc aggtatttgg ggagaccaaa 
cttccttcga ccagaagttt cagatgggca 
ccagggagag ctgaggagaa caaagtctaa 
tgcagatgct gaacctccac ctcctcctaa 
aacacacctc ggcct-ggcta ctaatgagga 
gacacctacc atgtacaagt atcggccggg 
tgccgcattc ctccagcgcc aagttgagtc 
ttgcagagag cagccgtcac cccagctacc 
tccgttctcc tacctttggc aaaagttttc 
cctccagcct caagtcagcc cagggcacag 
gttcagaggg caccacctcc acctcctata 
gcctatctta tgacagcttg ctcacacctt 
cagggcctga gccagaccca cctttaggct 
cccagcaacg ggaagctgag aggcacccac 
agccctcacc agtccgttac gacaatctgt 
gagagaagtt gctgcgccag tcacccccac 
gggactcagg cattcagtcg acaccaggct 
cagatgattc aaagagatca cctttgggca 
gttttggcaa gccagatggg ctaaggggcc 
acagccccat acctgggccg atcgatgtct 
tctgagacag aagaagtggc cttgcagcca 
aagaccacct acagcaaatc caacgggcag 
ccaggccagc cacctctcag tagccccacg 
ggtggtacca cctatgagat ttcggtgtga 
tgcgcctaca ccaaagggcc ccaggtggcc 
gtgcatggac attttttaaa ccaccgattc 
agtaggcttg gggagtcgga gagttggggc 
atcttttaag accttccctt ccttgatccc 
tgctcgccct ggagggaacc agatcatttt 
tacggattct atttttttcc tcttctgcgt 
tatatatata ttataaatat caaagaaatt 
gagggataca tatacggagg gggatcttac 
aggggagacg tcagtctttt tcctgtggtt 
ggaaatagca tcctctgctg gtgctaagtg 
tctggaattt tggtcacaag agggaaggac 
ttggcattta gtttccctag gaaaggggtc 



cagcctactg tttgtctcca gtttttaaac 60 
ctgccttgtt ctggggaggt ggttattcat 120 
gctttccttc atttgagatc ttttgacctt 180 
aagttttctg tttccctggt tttcttttgt 240 
ttcttgtctg ttctgccgct gtgtgggcct 3O0 
gagctccaac atgcccgcag agtctggaaa 3 60 
ctctgcagcc gccatcttcc tagtgggagc 420 
aggactaagc ctgtatgtgt cacctgcagt 480 
tgtgttggcc aacttcagca tggccacctt: 540 
ggaggatgag gacaaggaag atgatttccg 600 
gggcatccag gtgcgcatga aatggtgtgc 660 
ttcccactgc agtgtctgtg acaactgtgt 720 
gaataactgt attggtcgcc ggaactaccg 780 
agcccacatt atgggtgtgt ttggctttgg 840 
actctcaggg gtccgcacgg ctgtcacaac 900 
catccctgta gctggcctca cgggatttca 960 
caatgaacag gttacgggta aattccgggg 1020 
taacaatgtc agccgtgttc tctgcagttc 1080 
gaaagagaag acaattgtaa tcagacctcc 1140 
gataactgtg aagatcatgg ataatggcat 12 00 
gggaagcctg gagataacag agagccagtc 1260 
gccagacctg agccgttaca cagggttgcg 1320 
tagtagctta ttggccaagg acagcccccc 13 80 
ttacagtagc agcagtaccg tcagctgcca 1440 
gtggggacag cttgaaggag ccaacctcaa 1500 
gctcagagcc cagcttggaa ccagagagct 1560 
acttcgatcc actatccagt ggctcacgct 1620 
gctttgagct gggccagttg caatccattc 1680 
agagcctggc caaccagaca cgcaatggaa 1740 
cagacagccc tgattttgag tcagtgcagg 180O 
atacctctcc cttcctgtca gccaggctgg 1860 
gtttggtgcc aactggccca acacaccgag 1920 
cgcgccacat tgtggcctct ctccaggaac 1980 
tcccgggccg tgaggaagaa ccaggcttgg 2040 
cgggccatgc ccctcgtact agttcctcct 2100 
agactccact. gggacgccca gctgt-ccccc 2160 
ggggagtagg gtcccctgaa ctcaggccca 2220 
tacagcagcc aaaaagccca acctggtgtc 2280 
ttactgacac ccaaagatga agtacagctg 2340 
cccaagagct taggctcagc ctcccctggc 2400 
aggggaggag tcaagaaggt gtcaggggtt 2460 
gccttcggca cctcccctcc ccaacgcctc 2520 
accttccttc cctcaagggg ctcccctccc 2580 
caagaggatg aggagtgttt tctaaaatgc 2640 
cctgagactg gggtagcaac ccccccttttn 2700 
tggaccagac tcagtggaca tttgtgcaat 2760 
taaaccagaa ataatttttt ttattattgt 2 820 
taccaggtgt gtgtgtacat ataatatata 2 880 
atatatctat cctgggatgg gaaaatgagg 2940 
tcttcccatt cctcagacca gcaggaaaag 3 000 
ccctctcatt tgtcccagtt actaactacg 3 060 
tgattaggaa gaagcctggg gagaggcgag 3120 
ttggagagga gaattagttt tctaggctca 3180 
aaaacttcaa gacactggtg gtggtgggag 3240 






WO 00/75298



atcaggaaaa taacttggcc tagctcaaac aatattggat aatcccctcc ttgggggaga 3300
gggattagag tgtgctccta ctggcccctt ggagcctccc ctagcttaca cagttaactt 3360
gattttaaaa tccaaggcca ggagagaaga atccaaaaag caatattttt catcacatgc 3420
caaaaacggg ggatagagag aaggagtggc aggcctaggc ccctccgatt gtcccttggg 3480
ggttacccct cagcccacct cactatggtg ctgggtagag gggatacctg ggttctaacc 3540
tctaaatagg ggagatccca gcctccacaa agaggccctt ttatttttta ttctgattag 3600
ccattttaaa ccaacgagga ataaaaagaa atcctgatct aaaa 3644



<210> 7 
<211> 3117 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 405313.4

<220> 

<221> unsure 

<222> 64, 521, 534, 547 

<223> a, t, c, g, or other 



<400> 7 
gtccgcccgc ggtcccggcg gcgccaggtg cgttcactct gcccggctcc agccagcgtc 60
cgcngccgcc gtagctgccc caggctcccc gccccgctgc cgagatggcg acgcgctcct 120
gtcgggagaa ggctcagaag ctgaacgagc agcaccagct catcctatcc aagcttctga 180
gggaggagga caacaagtac tgcgccgact gcgaggccaa aggtcctcga tgggcttcct 240
ggaatattgg tgtgtttatt tgcatcagat gtgctggaat tcatagaaat cttggggttc 300
atatatccag ggtcaaatca gtcaacctag accaatggac agcagaacag atacagtgca 360
tgcaagatat gggaaatact aaagcaagac tactctatga agccaatctt ccagagaact 420
ttcgaagacc acagacagat caagcagtgg aatttttcat cagagataaa tatgaaaaga 480
agaaatacta cgataaaaat gccatagcta ttacaaataa ngaaaaggaa aaanaaaagg 540
aagaganaaa gagagaaaag gagccagaaa agccggcaaa accacttaca gctgaaaagc 600
tgcagaagaa agatcagcaa ctggagccta aaaaaagtac cagccctaaa aaagctgtgg 660
agcccactgt ggatctttta ggacttgatg gccctgctgt ggcaccagtg accaacggga 720
acacaacggt gccacccctg aacgatgatc tggacatctt tggaccgatg atttctaatc 780
ccttacctgc aactgtcatg cccccagctc aggggacacc ctctgcacca gcagctgcaa 840
ccctgtctac agtaacatct ggggatctag atttattcac tgagcaaact acaaaatcag 900
aagaagtggc aaagaaacaa ctttccaaag actccatctt atctctgtat ggcacaggaa 960
ccattcaaca gcaaagtact cctggtgtat ttatgggacc cacaaatata ccatttacct 1020
cacaagcacc agctgcattt cagggctttc catcgatggg cgtgcctgtg cctgcagctc 1080
ctggccttat aggaaatgtg atgggacaga gtccaagcat gatggtgggg catgcccatg 1140
ccccaatggg tttatgggaa atgcacaaac tggtgtgatg ccacttcctc agaacgttgt 1200
tggcccccaa ggaggaatgg tgggacaaat gggcgcaccc cagagtaagt ttggcctgcc 1260
gcaagctcag cagccccagt ggagcctctc acagatgaat cagcagatgg ctggcatgag 1320
tatcagtagt gcaaccccta ctgcaggttt tggccagccc tccagcacaa cagcaggatg 1380
gtctggaagc tcatcaggtc agactctcag cacacaactg tggaaatgaa aactgcaata 1440
caagtttcat ccagaactac cacctgacat tccttgctga aacgcatcta gttcccctgt 1500
ttattcatat gcatattttt tttcttttta cccatttgtt catattaaga atgatctgat 1560
tgaccgtgtt ggtctgtact gattcaattt gatgtggtga aaagcaggtt gataaatcat 1620
tttatgtcaa gggcagcttt gctcatattt cccatgattt catgtactgc attatttgag 1680
aagctgctca acttgcaaaa tcagttttcc tctcaataaa attatagctc taatgtttgc 1740
atataaggga agtagttatc atgttagtaa tacctctaat agtataaacc ccaccccaaa 1800
attagccagt aatcctgtag gaaggtactg tatgatcaaa tgtttaatca tataaataga 1860
atgtaaatgt ctcactgagc actgttttct agtgtatcaa aatgctctta tttcatcatt 1920
cacttcactg tgctgttgtt atgatgtgct taacagggaa cgtgattagt gaaaggaaga 1980
taaacgtgga tgttactcca aaacttcgtt taatgaatgc ttaaagaatt caaattttat 2040
ctgcctctct tgtaatttgg atctcttctt aatgtacata gtgctaacat gaagaccttt 2100
ttctgcacta tatgcaaaca gggtaactaa ctaaaacaaa gccaetttca atcttcaatc 2160
cttgaaggta tatctaggtt tatgacagta attgtgttta cattttatgg tgcctagtat 2220
tgacaaaatg ttatttccct acattaaaca tgactccata gaccttttca tatgtgggtt 2280






PCT/US00/15344



tttatttcct atgatgtata ctgccactaa ccttccaaaa attacttagt attgcaaagt 2340
caggaatcat caggaacgtt tagctgacaa aatacttgtc tgttttaaaa acctgttcaa 2400
gtctaccaac ctgttcaagt ctaccaatta taagggcaaa ttggagaaaa agaaaaaata 2460
tatactcaag agtggtatct tgcagtatcg gcactgtaca aaaaaatctt ccaatttagt 2520
tgttgtagag aaaacatgca gaacaaatga agacaaaaca tacattttgt accaaccatc 2580
caattagctt atgttaactg acaagctcca tttaaacaga tgtccatcag atgacaagaa 2640
aggctgctgt actgaagtaa aacaaacaat acctgaatgc tctgtagcct aaactccaaa 2700
catcctcttc catatggatc cactggctgg acaaactgca ccagttgctg cttcaattta 2760
tacctcaatt ttcactgtgt ccaggtggta ctttggctcg ttggctagat taaccttctc 2820
tgtccgagtg tgccacacga gaacctgaag gggaaggaaa tagcttgggt agcgcactct 2880
tcatggtgac actcgaggtc gggcagcaca agtgtaatga ataccttagt gcagttattt 2940
gctttcggtt ccagttcttc gactgttgtt atctgtttga gaaagtcaga ttcttgcatc 3000
gctggctggg atccacgacg cttaaataca gcttttggat tggacaaaat gacttgaaga 3060
cttacagcaa atcctttgtg aaaaataaaa aaaaaaaaag agactttaaa aaaaaaa 3117



<210> 8 

<211> 2235 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 436857.2 

<220> 

<221> unsure 

<222> 289-319 

<223> a, t, c, g, or other 



<400> 8 

ttcatcccgt atctgcgcgt atgagatgca ttgtctcttc ctctgcagtt gagctgaatg 60
aatacctccg aagccgcttt gttctccaga tgtgaatagc tccactatac cagcctcgtc 120
ttccttccgg gggacaacgt gggtcagggc acagagagat atttaatgtc accctcttgg 180
ggctttcatg ggactccctc tgccacattt tttggaggtt gggaaagttg ctagaggctt 240
cagaactcca gcctaatgga tcccaaactc gggagaatgg ctgcgtccnn nnnnnnnnnn 300
nnnnnnnnnn nnnnnnnnng cggcatgttc tcctcaccct ccccgccccc ggcgctgtta 360
gagaaagtct tccagtacat tgacctccat caggatgaat ttgtgcagac gctgaaggag 420
tgggtggcca tcgagagcga ctctgtccag cctgtgcctc gcttcagaca agagctcttc 480
agaatgatgg ccgtggctgc ggacacgctg cagcgcctgg gggcccgtgt ggcctcggtg 540
gacatgggtc ctcagcagct gcccgatggt cagagtcttc caatacctcc cgtcatcctg 600
gccgaactgg ggagcgatcc cacgaaaggc accgtgtgct tctacggcca cttggacgtg 660
cagcctgctg accggggcga tgggtggctc acggacccct atgtgctgac ggaggtagac 720
gggaaacttt atggacgagg agcgaccgac aacaaaggcc ctgtcttggc ttggatcaat 780
gctgtgagcg ccttcagagc cctggagcaa gatcttcctg tgaatatcaa attcatcatt 840
gaggggatgg aagaggctgg ctctgttgcc ctggaggaac ttgtggaaaa agaaaaggac 900
cgattcttct ctggtgtgga ctacattgta atttcagata acctgtggat cagccaaagg 960
aagccagcaa tcacttacgg aacccggggg aacagctact tcatggtgga ggtgaaatgc 1020
agagaccagg attttcactc aggaaccttt ggtggcatcc ttcatgaacc aatggctgat 1080
ctggttgctc ttctcggtag cctggtagac tcgtctggtc atatcctggt ccctggaatc 1140
tatgatgaag tggttcctct tacagaagag gaaataaata catacaaagc catccatcta 1200
gacctagaag aataccggaa tagcagccgg gttgagaaat ttctgttcga tactaaggag 1260
gagattctaa tgcacctctg gaggtaccca tctctttcta ttcatgggat cgagggcgcg 1320
tttgatgagc ctggaactaa aacagtcata cctggccgag ttataggaaa attttcaatc 1380
cgtctagtcc ctcacatgaa tgtgtctgcg gtggaaaaac aggtgacacg acatcttgaa 1440
gatgtgttct ccaaaagaaa tagttccaac aagatggttg tttccatgac tctaggacta 1500
cacccgtgga ttgcaaatat tgatgacacc cagtatctcg cagcaaaaag agcgatcaga 1560
acagtgtttg gaacagaacc agatatgatc cgggatggat ccaccattcc aattgccaaa 1620
atgttccagg agatcgtcca caagagcgtg gtgctaattc cgctgggagc tgttgatgat 1680
ggagaacatt cgcagaatga gaaaatcaac aggtggaact acatagaggg aaccaaatta 1740
tttgctgcct ttttcttaga gatggcccag ctccattaat cacaagaacc ttctagtctg 1800
atctgatcca ctgacagatt cacctccccc acatccctag acagggatgg aatgtaaata 1860






tccagagaat ttgggtctag tatagtacat tttcccttcc atttaaaatg tcttgggata 1920
tctggatcag taataaaata tttcaaaggc acagatgttg gaaatggttt aaggtccccc 1980
actgcacacc ttcctcaagt catagctgct tgcagcaact tgatttcccc aagtcctgtg 2040
caatagcccc aggattggat tccttccaac cttttagcat atctccaacc ttgcaatttg 2100
attggcataa tcactccggt ttgctttcta ggtcctcaag tgctcgtgac acataatcat 2160
tccatccaat gatcgccttt gctttaccac tctttccttt tatcttatta ataaaaatgt 2220
tggtctccac cactg 2235



<210> 9 

<211> 542 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 247285.1- j 



<400> 9 

cggggactga gggaagaagt gaaaatcgga ctgccaggcg acagttcctc cgtttgaaat 60
ctcgccggct cctgagcggg ccaccgggcc cgggctgggg gtctggcggg agaaataact 120
ttatttggac tgagagctgg agaatgagaa taggacctga gagtatattg ggctaaggag 180
gagaggtgtt tgagcccaga tgagtcatgg ctggacgacc cctccgcata ggagatcagc 240
tggttctgga agaagattat gatgagacct acattcctag tgagcaagaa attcttgaat 300
ttgcccggga gattggtatt gatcccatca aggaaccaga actgatgtgg ctggcgcgag 360
agggcatcgt ggccccactg cctggagagt ggaaaccatg ccaggacatc acaggtgaca 420
tttactattt caacttcgcc aacgggcagt ctatgtggga ccatccatgt gacgaacact 480
atcggagctt ggtgatccaa gagcgggcaa agctgtcaac ttctggggcc attaagaaga 540
ag 542



<210> 10 

<211> 358 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 254510-1. j 



<400> 10 

cggacggcgt ggagtgactg tcccaccgcc gcgggattga cttctaaaga ctcttggtac 60
ccgaggaaga aacccggaag aggaagagga gagcaaagga gtcagggatg gctttttctc 120
agggtctatt gacattcagg gatgtggcca tagaattctc tcaggaggag tggaaatgcc 180
tggaccctgc tcagaggact ctatacagag acgtgatgct ggagaattat aggaacctgg 240
tctccctgga tacctcttcc aaatgcatga tgaagatgtt ctcatcaaca ggacaaggca 300
atacagaagt ggtccacaca gggacattgc aaatacatgc aagtcatcac attggaga 358

<210> 11

<211> 1481

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 284125.2. j

<400> 11

gtgttgcgcg actggccttg agggagagct ggggcctgct cccggagaga tacggctatg 60
tcgatcgaaa tcgaatcttc ggatgtgatc cgccttatta tgcagtactt gaaggagaac 120
agtttacatc gggcgttagc caccttgcag gaggagacta ctgtgtctct gaatactgtg 180
gacagcattg agagttttgt ggctgacatt aacagtggcc attgggatac tgtgttgcag 240
gctatacagt ctctgaaatt gccagacaaa accctcattg acctctatga acaggttgtt 300
ctggaattga tagagctccg tgaattgggt gctgccaggt cacttttgag acagactgat 360
cccatgatca tgttaaaaca aacacagcca gagcgatata ttcatctgga gaaccttttg 420
gccaggtctt actttgatcc tcgtgaggca tacccagatg gaagtagcaa agaaaagaga 480
agagcagcaa ttgcccaggc cttagctggc gaagtcagtg tggtgcctcc atctcgtctc 540
atggcattgc tgggacaggc actgaagtgg cagcagcatc agggattgct tcctcctggt 600
atgaccatag atttgtttcg aggcaaggca gctgtcaaag atgtggaaga agaaaagttt 660
cctacacaac tgagcaggca tattaagttt ggtcagaaat cacatgtgga gtgtgctcga 720
ttttctccag atggtcagta tttggtcact gggtctgttg atggattcat tgaagtatgg 780
aactttacta ctggaaaaat cagaaaggat cttaagtacc aggcccaaga taactttatg 840
atgatggatg atgctgtcct ctgcatgtgt ttcagcagag atacagaaat gttagcaact 900
ggggcccaag atggaaaaat caaggtgtgg aagattcaga gtggacaatg tttaaggaga 960
tttgagaggg cacacagtaa gggtgtcacc tgtctaagct tttctaagga tagcagtcag 1020
atccttagtg cttcttttga ccagacaatt agaattcatg gtttaaaatc tgggaaaacc 1080
ctgaaggaat ttcgtggcca ttcctccttt gttaacgaag caacatttac acaagatgga 1140
cattacatta ttagtgcatc ctctgatggc actgtaaaga tctggaatat gaagaccaca 1200
gaatgttcaa atacctttaa atccctgggc agcaccgcag ggaccagata ttaccgtcaa 1260
cagtgtgatt ctacttccta aaaaccctga gcactttgtg gtgtgcaaca gatcaaacac 1320
ggtggtcatc atgaacatgc aggggcagat tgtcagaagc ttcagttctg gtaaaagaga 1380
aggtggggac tttgtttgct gtgccctctc tccccgtggt gaatggatct actgtgtagg 1440
ggaggacttt gtgctctact gtttcagtac agtcactggc a 1481

<210> 12

<211> 2439

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 3 31554.4 .j

<220>

<221> unsure

<222> 7, 19, 41, 624, 1062

<223> a, t, c, g, or other

<400> 12


ccggatntta gtgtcagang cgcccccagc cgggcgggcg nctcagccat ggccctgcgc 60
aaggaactgc tcaagtccat ctggtacgcc tttaccgcgc tggacgtgga gaagagtggc 120
aaagtctcca agtcccagct caaggtgctg tcccacaacc tgtacacggt cctgcacatc 180
ccccatgacc ccgtggccct ggaggaacac ttccgagatg atgatgacgg ccctgtgtcc 240
agccagggat acatgcccta cctcaacaag tacatcctgg acaaggtgga ggagggggct 300
tttgttaaag agcactttga tgagctgtgc tggacgctga cggccaagaa gaactatcgg 360
gcagatagca acgggaacag tatgctctcc aatcaggatg ccttccgcct ctggtgcctc 420
ttcaacttcc tgtctgagga caagtaccct ctgatcatgg ttcctgatga ggtggaatac 480
ctgctgaaaa aggtactcag cagcatgagc ttggaggtga gcttgggtga gctggaggag 540
cttctggccc aggaggccca ggtggcccag accaccgggg ggctcagcgt ctggcagttc 600
ctggagctct tcaattcggg ccgntgcctg cggggcgtgg gccgggacaa cctcagcatg 660
gccatccacg aggtctacca ggagctcatc caagatgtcc tgaagcaggg ctacctgtgg 720
aagcgagggc acctgagaag gaactgggcc gaacgctggt tccagctgca gcccagctgc 780
ctctggctac tttgggagtg aagagtgcaa agagaaaagg ggcattatcc cgctggatgc 840
acactgctgc gtggaggtgc tgccagaccg cgacggaaag cgctgcatgt tctgtgtgaa 900
gacagccacc cgcacgtatg agatgagcgc ctcagacacg cgccaggcca ggagtggaca 960
gctgccatcc agatggcgat ccggctgcag gccgagggga agacgtccct acacaaggac 1020
ctgaagcaga aacggcgcga gcagcgggag cagcgggagc gncgccgggc ggccaaggaa 1080
gaggagctgc tgcggctgca gcactgcagg aggagaagga gcggaagtgc aggagctgga 1140
gctgctgcag gaggcgcacg gcaggccgag cggctgctgc aggaggagga ggaacggcgc 1200
cgcagccagc accgcgagct gcagcaggcg ctcgagggcc aactgcgcga ggcggagcag 1260
gcccgggcct ccatgcaggc tgagatggag ctgaaggagg aggaggctgc ccggcagcgg 1320
cagcgcattc aaggagctgg aggatatgca gcagcggttg caggaggccc tgcaactaga 1380
ggtgaaagct cggcgagatg aagaatctgt gcgaatcgct cagaccagac tgctggaaga 1440
ggaggaagag aagctgaagc agttgatgca gctgaaggag gagcaggagc gctacatcga 1500
acgggcgcac aggagaagga agagctgcag caggagatgg cacagcagag ccgctccctg 1560
cagcaggccc agcagcagct ggaggaggtg cggcagaacc ggcagagggc tgacgaggat 1620
gtggaggctg cccagagaaa actgcgccag gccagcacca acgtgaaaca ctggaatgtc 1680
cagatgaacc ggctgatgca tccaattgag cctggagata agcgtccggt caccagcagc 1740
tccttctcag gcttccagcc ccctctgctt gcccaccgtg actcctccct aaagcgcctg 1800
acccgctggg gatcccaggg caacaggacc ccctgcgccc aacagcaatg agcagcagaa 1860
gtccctcaat ggtggggatg aggctcctgc cccggcttcc acccctcagg aagataaact 1920
ggatccagca ccagaaaatt agcctctctt agccccttgt tcttcccaat gtcatatcca 1980
ccaggacctg gccacagctg gcctgtgggt gatcccagct cttactagga gagggagctg 2040
aggtcctggt gccaggggcc caggccctcc aaccataaac agtccaggat ggaacctggt 2100
tcacccttca taccagctcc aagccccaga ccangggagc tgtctgggat gttgatcctt 2160
gagaacttgg ccctgtgctt tagacccaag gacccgattc ctgggctagg aaagagagaa 2220
caagcaagcc ggggctacct gcccccaggt ggccaccaag ttgtggaagc acatttctaa 2280
ataaaaactg ctcttagaat gaattattgg ctcaggtctg tccatctctc ctgccatttc 2340
ctcccttcct ccctcaagcc ccgttatagg ttccaaagag cagtaaaagt ataataaagt 2400
ggttaagaaa gaccctgcag ctagactgcc tgggttctg 2439

<210> 13

<211> 1307

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 331642.1- j

<220>

<221> unsure

<222> 891

<223> a, t, c, g, or other

<400> 13

ccggctcgct agccgtcctg cgggacgccg gcgctgatgg gttggggaaa tggacgcctg 60
gagaacggaa atccagttat caaaattgac tcaagaagag agaacctaac agaacaataa 120
caatggaaga aattgggaac attatcacaa agctatcatc ctgccaaact ccaggctcag 180
atgtcacagg ttaaaaaaaa gtccttcatg aaaaagaaag atcttaagca gcatgatgga 240
ttcagaagct catgaaaaga ggccaccaat actaacatct tcaaaacacg atatatcacc 300
tcatattaca aatgttggtg agatgaagca ttacttgtgt ggctgctgcg ccgtatcgaa 360
caacatcgca atcacatatc ccattcagaa ggtccccttt cgacaacagc tgtatggcat 420
caaaacccgg gatgcaatac ttcagttgag aagggatgga tttcgaaatt tgtatcgtgg 480
aatccttccc ccattgatgc agaagacaac tacgcctgca cttatgtttg gtctgtatga 540
ggatttatcc tgccttctcc acaagcatgt cagtgctcca gagtttgcaa ccagtggcgt 600
ggcggcagtg cttgcaggga caacagaagc aattttcact ccactggaaa gagttcagac 660
attgcttcaa gaccacaagc atcatgacaa atttaccaac acttaccagg ctttcaaggc 720
actgaaatgt catggaattg gagagtatta tcgaggttgg tgcccattct tttccggaat 780
ggactcagca atgtcttgtt tttcggcttc gaggtcccat taaggagcat ctgcctaccg 840
caacgactca cagtgctcat ctggtcaatg attttatctg tggaggtcta ntgggtgcca 900
tgttgggatt cttgtttttt ccaattaatg ttgtaaaaac tcgcatacag tctcagattg 960
gtggggaatt tcagtctttc cccaaggttt tccaaaaaat ctggctggaa cgggacagaa 1020
aactgataaa tcttttcaga ggtgcccatc tgaattacca tcggtccctc atctcttggg 1080
gcataatcaa tgcaacttat gagttcttgt taaaggttat atgaaaaaac catcagttaa 1140
gtgccattta tcaactgaat agaccttcta agaagaatgc agtttggcct ctttcttagt 1200
tggccaaata caagttggtg tcataactcc aggccacagt gagttatggg caaagctgtt 1260
ttgcttaagc ctcaataaaa cagaataaaa gattccaata ggaaaat 1307



<210> 14 

<211> 303 

<212> DNA 

<213> Homo sapiens 



<220> 



<221> misc_feature 

<223> Incyte ID No: 445594.2 .j 

<220> 

<221> unsure 
<222> 184 

<223> a, t, c, g, or other 
<400> 14 

gcgctctcgg cccacacaat atgacctcgg ggaggatgcg aggaagatga actgtgatga 60 
tccacttctt cttaatgaat gactgactta cctgagaaag aaactcagag gaagaggaaa 120 
gaaagaagag gagggaatgg ctctttctca gggactgttt acattcaagg atgtggccat 180 
aganttctct caagaggagt gggagtgcct ggaccctgcc cagagggcct tgtacaggga 240
cgtgatgttg gagaactaca ggaacctgct ttctctcgat gaggataaca tccctccaga 300
aga 303
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